Re: XLogInsert scaling, revisited

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: XLogInsert scaling, revisited
Date: 2013-07-01 13:40:44
Message-ID: 20130701134044.GR11516@alap2.anarazel.de
Lists: pgsql-hackers

On 2013-06-26 18:52:30 +0300, Heikki Linnakangas wrote:
> >* Could you document the way slots prevent checkpoints from occurring
> > when XLogInsert rechecks for full page writes? I think it's correct -
> > but not very obvious on a glance.
>
> There's this in the comment near the top of the file:
>
> * To update RedoRecPtr or fullPageWrites, one has to make sure that all
> * subsequent inserters see the new value. This is done by reserving all the
> * insertion slots before changing the value. XLogInsert reads RedoRecPtr and
> * fullPageWrites after grabbing a slot, so by holding all the slots
> * simultaneously, you can ensure that all subsequent inserts see the new
> * value. Those fields change very seldom, so we prefer to be fast and
> * non-contended when they need to be read, and slow when they're changed.
>
> Does that explain it well enough? XLogInsert holds onto a slot while it
> rechecks for full page writes.

I am a bit worried about that logic. We're basically reverting to the
old logic where xlog writing is an exclusive affair. We will have to wait
for all the other queued inserters before we're finished. I am afraid
that will show up latency-wise.

I have two ideas to improve on that:
a) Queue the backend that does WALInsertSlotAcquire(true) at the front
of the exclusive waiters in *AcquireOne. That should be fairly easy.
b) Get rid of WALInsertSlotAcquire(true) by not relying on blocking all
slot acquisition. I think with some trickery we can do that safely:
In CreateCheckpoint() we first acquire the insertpos_lck and read
CurrBytePos as a recptr. Set some shared memory variable, say,
PseudoRedoRecPtr, that's now used to check whether backup blocks need to
be made. Release insertpos_lck. Then acquire each slot once, but without
holding the other slots. That guarantees that all XLogInsert()ing
backends henceforth see our PseudoRedoRecPtr value. Then just proceed in
CreateCheckpoint() as we're currently doing except computing RedoRecPtr
under a spinlock.
If a backend reads PseudoRedoRecPtr before we've set RedoRecPtr
accordingly, all that happens is that we possibly have written a FPI too
early.
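
A minimal sketch of idea (b) as a toy model, not code from the patch. The
struct layout, NUM_SLOTS, the spin_lock/spin_unlock helpers, info_lck as the
lock around RedoRecPtr, and the function names are all assumptions made for
illustration; C11 atomic_flag stands in for both the spinlocks and the slots.
The point it tries to show is that the checkpointer only ever holds one slot
at a time, yet once it has visited every slot, any inserter that grabs a slot
afterwards sees the new PseudoRedoRecPtr.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_SLOTS 8                     /* hypothetical slot count */

typedef uint64_t XLogRecPtr;

/* Toy stand-in for the shared insert state; not the patch's real layout. */
typedef struct WalState
{
    atomic_flag insertpos_lck;              /* protects CurrBytePos */
    XLogRecPtr  CurrBytePos;                /* treated as a recptr directly here */
    _Atomic XLogRecPtr PseudoRedoRecPtr;    /* what inserters compare page LSNs to */
    atomic_flag info_lck;                   /* assumed lock protecting RedoRecPtr */
    XLogRecPtr  RedoRecPtr;
    atomic_flag slots[NUM_SLOTS];           /* one lock per insertion slot */
} WalState;

/* Busy-wait locks; good enough for a model, not for production use. */
void spin_lock(atomic_flag *l)   { while (atomic_flag_test_and_set(l)) { } }
void spin_unlock(atomic_flag *l) { atomic_flag_clear(l); }

void wal_init(WalState *w)
{
    atomic_flag_clear(&w->insertpos_lck);
    atomic_flag_clear(&w->info_lck);
    for (int i = 0; i < NUM_SLOTS; i++)
        atomic_flag_clear(&w->slots[i]);
    w->CurrBytePos = 0;
    w->RedoRecPtr = 0;
    atomic_init(&w->PseudoRedoRecPtr, 0);
}

/* Checkpointer side: publish the new redo pointer without holding all slots. */
XLogRecPtr checkpoint_set_redo(WalState *w)
{
    XLogRecPtr redo;

    /* Step 1: read the insert position and publish it as PseudoRedoRecPtr. */
    spin_lock(&w->insertpos_lck);
    redo = w->CurrBytePos;
    atomic_store(&w->PseudoRedoRecPtr, redo);
    spin_unlock(&w->insertpos_lck);

    /*
     * Step 2: take each slot once, one at a time. An inserter that held a
     * slot while possibly reading the old value must have released it
     * before we get past that slot.
     */
    for (int i = 0; i < NUM_SLOTS; i++)
    {
        spin_lock(&w->slots[i]);
        spin_unlock(&w->slots[i]);
    }

    /* Step 3: set the real RedoRecPtr under its own lock. */
    spin_lock(&w->info_lck);
    w->RedoRecPtr = redo;
    spin_unlock(&w->info_lck);

    return redo;
}

/* Inserter side: decide on a full-page image while holding one slot. */
bool inserter_needs_fpi(WalState *w, int slot, XLogRecPtr page_lsn)
{
    assert(slot >= 0 && slot < NUM_SLOTS);

    spin_lock(&w->slots[slot]);
    XLogRecPtr redo = atomic_load(&w->PseudoRedoRecPtr);
    bool need = (page_lsn <= redo);     /* page last modified before redo point */
    spin_unlock(&w->slots[slot]);

    return need;
}
```

In this model an inserter that picks up PseudoRedoRecPtr before step 3 has
run simply decides on a full-page image against the new value a little
early, which is the harmless case described above.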

Makes sense?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
