Re: XLogInsert

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: XLogInsert
Date: 2009-09-14 03:42:11
Message-ID: f67928030909132042v432f7194le9e0eab2d1156b21@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 19, 2009 at 9:49 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
> > If I read the code correctly, the only thing that is irrevocable is
> > that it writes into
> > rdt->next, and if it saved an old copy of rdt first, then it could
> > revoke the changes just
> > by doing rdt_old->next=NULL. If that were done, then I think this
> > code could be
> > moved out of the section holding the WALInsertLock.
>
> Hmm, I recall that the changes are ... or were ... more complex.
> The tricky case I think is where we have to go back and redo the
> block-backup decisions after discovering that the checkpoint REDO
> pointer has just moved.
>
> If you can get the work out of the WALInsertLock section for just a
> few more instructions, it would definitely be worth doing.
>

I've attached a patch which removes the iteration over the blocks to be
backed-up from the critical section of XLogInsert. Now those blocks are
only looped over in one piece of code which both computes the CRC and builds
the linked list, rather than having parallel loops.

I've used an elog statement (not shown in patch) to demonstrate that the
"goto begin;" after detecting REDO race actually does get executed under a
standard workload, (pgbench -c10). Two to 4 out of 10 the backends execute
that code path for each checkpoint on my single CPU machine. By doing a
kill -9 on a process, to simulate a crash, during the period after the goto
begin is execercised but before the precipitating heckpoint completes, I can
force it to use the written WAL records in recovery. The database
automatically recovers and the results are self-consistent.

I cannot imagine any other races, rare events, or action at a distance that
could come into play with this code change, so I cannot think of anything
else to test at the moment.

I could not detect a speed difference with pgbench, but as I cannot get
pgbench to be XLogInsert bound, that is not surprising. Using the only
XLogInsert-bound test case I know of, parallel COPY into a skinny, unindexed
table, using 8 parallel copies on a 4 x dual-core x86_64 and with fsync
turned off (to approxiamately simulate SSD, which I do not have), I get a
speed improvement of 2-4% with the patch over unpatched head. Maybe with
more CPUs the benefit would be greater.

That small improvement is probably not very attractive, however I think the
patch makes the overall code a bit cleaner, so it may be warranted on that
ground. Indeed, my motivation for working on this is that I kept beating my
head against the complexity of the old code, and thought that simplifying it
would make future work easier.

Cheers,

Jeff

Attachment Content-Type Size
XLogInsert_rdt2_v1.patch text/x-diff 6.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2009-09-14 03:46:46 Re: Rough draft: easier translation of psql help
Previous Message Jeff Janes 2009-09-14 03:09:04 test_fsync file overrun