Re: Spread checkpoint sync

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Greg Smith <greg(at)2ndquadrant(dot)com>
Cc:	Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Spread checkpoint sync
Date:	2011-01-12 01:27:36
Message-ID:	AANLkTi=P0te3oFq0LVS8cGLkGF_Wp9ery0fOu9SHEcs9@mail.gmail.com
Lists:	pgsql-hackers

On Tue, Nov 30, 2010 at 3:29 PM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> Having the pg_stat_bgwriter.buffers_backend_fsync patch available all the
> time now has made me reconsider how important one potential bit of
> refactoring here would be. I managed to catch one of the situations where
> really popular relations were being heavily updated in a way that was
> competing with the checkpoint on my test system (which I can happily share
> the logs of), with the instrumentation patch applied but not the spread sync
> one:
>
> LOG: checkpoint starting: xlog
> DEBUG: could not forward fsync request because request queue is full
> CONTEXT: writing block 7747 of relation base/16424/16442
> DEBUG: could not forward fsync request because request queue is full
> CONTEXT: writing block 42688 of relation base/16424/16437
> DEBUG: could not forward fsync request because request queue is full
> CONTEXT: writing block 9723 of relation base/16424/16442
> DEBUG: could not forward fsync request because request queue is full
> CONTEXT: writing block 58117 of relation base/16424/16437
> DEBUG: could not forward fsync request because request queue is full
> CONTEXT: writing block 165128 of relation base/16424/16437
> [330 of these total, all referring to the same two relations]
>
> DEBUG: checkpoint sync: number=1 file=base/16424/16448_fsm time=10132.830000 msec
> DEBUG: checkpoint sync: number=2 file=base/16424/11645 time=0.001000 msec
> DEBUG: checkpoint sync: number=3 file=base/16424/16437 time=7.796000 msec
> DEBUG: checkpoint sync: number=4 file=base/16424/16448 time=4.679000 msec
> DEBUG: checkpoint sync: number=5 file=base/16424/11607 time=0.001000 msec
> DEBUG: checkpoint sync: number=6 file=base/16424/16437.1 time=3.101000 msec
> DEBUG: checkpoint sync: number=7 file=base/16424/16442 time=4.172000 msec
> DEBUG: checkpoint sync: number=8 file=base/16424/16428_vm time=0.001000 msec
> DEBUG: checkpoint sync: number=9 file=base/16424/16437_fsm time=0.001000 msec
> DEBUG: checkpoint sync: number=10 file=base/16424/16428 time=0.001000 msec
> DEBUG: checkpoint sync: number=11 file=base/16424/16425 time=0.000000 msec
> DEBUG: checkpoint sync: number=12 file=base/16424/16437_vm time=0.001000 msec
> DEBUG: checkpoint sync: number=13 file=base/16424/16425_vm time=0.001000 msec
> LOG: checkpoint complete: wrote 3032 buffers (74.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=1.742 s, sync=10.153 s, total=37.654 s; sync files=13, longest=10.132 s, average=0.779 s
>
> Note here how the checkpoint was hung on trying to get 16448_fsm written
> out, but the backends were issuing constant competing fsync calls to these
> other relations. This is very similar to the production case this patch was
> written to address, which I hadn't been able to share a good example of yet.
> That's essentially what it looks like, except with the contention going on
> for minutes instead of seconds.
>
> One of the ideas Simon and I had been considering at one point was adding
> some better de-duplication logic to the fsync absorb code, which I'm
> reminded by the pattern here might be helpful independently of other
> improvements.

Hopefully I'm not stepping on any toes here, but I thought this was an
awfully good idea and had a chance to take a look at how hard it would
be today while en route from point A to point B. The answer turned
out to be "not very", so PFA a patch that seems to work. I tested it
by attaching gdb to the background writer while running pgbench, and
it eliminated the backend fsyncs without even breaking a sweat.
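
For anyone who wants the general shape of the idea without opening the attachment, here is a minimal sketch in Python. It is not the attached patch and does not reflect its actual data structures: FsyncRequest, compact_queue, forward_request, and the capacity argument are all hypothetical names invented for illustration, and keeping the first occurrence of each duplicate is just an assumption of this sketch. The point is only that when the queue is full of repeated requests for the same few relations, dropping the duplicates frees up room so the backend does not have to fall back to doing the fsync itself.

```python
from collections import namedtuple

# Hypothetical stand-in for one queued fsync request: which relation file
# and which segment of it needs to be synced.
FsyncRequest = namedtuple("FsyncRequest", ["relfile", "segno"])


def compact_queue(queue):
    """Return a new list with duplicate requests dropped.

    Keeps the first occurrence of each request and preserves the
    original order of the survivors.
    """
    seen = set()
    compacted = []
    for req in queue:
        if req not in seen:
            seen.add(req)
            compacted.append(req)
    return compacted


def forward_request(queue, capacity, req):
    """Try to append req to a bounded queue (a plain list here).

    If the queue is full, compact it once in place and retry. Returns
    True if the request was queued, or False if there is still no room,
    in which case the caller would have to do the fsync itself.
    """
    if len(queue) >= capacity:
        queue[:] = compact_queue(queue)
        if len(queue) >= capacity:
            return False
    queue.append(req)
    return True
```

With a queue of capacity 4 that holds nothing but alternating requests for segment 0 of base/16424/16442 and base/16424/16437, a fifth request triggers compaction down to two entries and is then queued normally. If every entry in a full queue is distinct, compaction frees nothing and forward_request returns False.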

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment	Content-Type	Size
compact-fsync-request-queue.patch	text/x-patch	4.3 KB

In response to

Re: Spread checkpoint sync at 2010-11-30 20:29:57 from Greg Smith

Responses

Re: Spread checkpoint sync at 2011-01-15 10:47:24 from Greg Smith
Re: Spread checkpoint sync at 2011-01-17 00:32:55 from Jeff Janes
