Re: Spread checkpoint sync

From: Rob Wultsch <wultsch(at)gmail(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Spread checkpoint sync
Date: 2010-12-05 22:32:28
Message-ID: AANLkTi=g8S7+kW23OcUZb_xGYhgEG_Q=T7XeWd=aMbFU@mail.gmail.com
Lists: pgsql-hackers

On Sun, Dec 5, 2010 at 2:53 PM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> Heikki Linnakangas wrote:
>>
>> If you fsync() a file with one dirty page in it, it's going to return very
>> quickly, but a 1GB file will take a while. That could be problematic if you
>> have a thousand small files and a couple of big ones, as you would want to
>> reserve more time for the big ones. I'm not sure what to do about it, maybe
>> it's not a problem in practice.
>
> It's a problem in practice all right, with bulk loading being the main
> situation where you'll hit it.  If somebody is running a giant COPY to populate
> a table at the time the checkpoint starts, there's probably a 1GB file of
> dirty data that's unsynced around there somewhere.  I think doing anything
> about that situation requires an additional leap in thinking about buffer
> cache eviction and fsync absorption though.  Ultimately I think we'll end
> up doing sync calls for relations that have gone "cold" for a while all the
> time as part of BGW activity, not just at checkpoint time, to try and avoid
> this whole area better.  That's a lot more than I'm trying to do in my first
> pass of improvements though.
>
> In the interest of cutting the number of messy items left in the official
> CommitFest, I'm going to mark my patch here "Returned with Feedback" and
> continue working in the general direction I was already going.  Concept
> shared, underlying patches continue to advance, good discussion around it;
> those were my goals for this CF and I think we're there.
>
> I have a good idea how to autotune the sync spread that's hardcoded in the
> current patch.  I'll work on finishing that up and organizing some more
> extensive performance tests.  Right now I'm more concerned about finishing
> the tests around the wal_sync_method issues, which are related to this and
> need to get sorted out a bit more urgently.
>
> --
> Greg Smith   2ndQuadrant US    greg(at)2ndQuadrant(dot)com   Baltimore, MD
> PostgreSQL Training, Services and Support        www.2ndQuadrant.us
>

Forgive me, but is all of this a step on the slippery slope to
direct I/O? And is this a bad thing?

--
Rob Wultsch
wultsch(at)gmail(dot)com
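
As an illustration of the concern Heikki raises in the quoted text (a thousand small files and a couple of big ones, with more time reserved for the big ones), here is a minimal sketch of one way a fixed sync window could be divided in proportion to each file's dirty data. This is not what the patch under discussion does; the function name, the file names, and the idea of knowing per-file dirty byte counts up front are all assumptions made for the example.

```python
def allocate_sync_time(dirty_bytes, total_seconds):
    """Split a sync window across files in proportion to their dirty bytes.

    dirty_bytes: mapping of file name -> estimated dirty bytes (hypothetical input;
                 the thread does not say such an estimate is available).
    total_seconds: length of the window reserved for the sync phase.
    Returns a mapping of file name -> seconds reserved for that file's fsync().
    """
    total_dirty = sum(dirty_bytes.values())
    if total_dirty == 0:
        # No dirty data anywhere: fall back to an even split.
        share = total_seconds / len(dirty_bytes) if dirty_bytes else 0.0
        return {name: share for name in dirty_bytes}
    return {
        name: total_seconds * nbytes / total_dirty
        for name, nbytes in dirty_bytes.items()
    }


if __name__ == "__main__":
    # A thousand files with one dirty 8 kB page each, plus two fully dirty 1 GB files.
    files = {f"small_{i}": 8192 for i in range(1000)}
    files["big_a"] = 1024 ** 3
    files["big_b"] = 1024 ** 3
    budget = allocate_sync_time(files, total_seconds=30.0)
    print(f"big_a:   {budget['big_a']:.3f} s")
    print(f"small_0: {budget['small_0']:.6f} s")
```

With an even split, each of the 1002 files would get the same slice of the window regardless of size; weighting by dirty bytes gives nearly all of the window to the two large files, which is the behaviour the quoted paragraph asks for.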
