Re: Move PinBuffer and UnpinBuffer to atomics

From: Andres Freund <andres(at)anarazel(dot)de>
To: YUriy Zhuravlev <u(dot)zhuravlev(at)postgrespro(dot)ru>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Move PinBuffer and UnpinBuffer to atomics
Date: 2015-09-11 16:14:21
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 2015-09-11 13:23:24 +0300, YUriy Zhuravlev wrote:
> Continuing the theme:

Please don't just start new threads for a new version of the patch.

> This time, we fairly rewrote 'refcount' and 'usage_count' to atomic in
> PinBuffer and UnpinBuffer (but save lock for buffer flags in Unpin).


> In the same time it doesn't affect to correctness of buffer manager
> because that variables already have LWLock on top of them (for partition of
> hashtable).

Note that there's a pending patch that removes the buffer mapping locks

> If someone pinned buffer after the call StrategyGetBuffer we just try
> again (in BufferAlloc). Also in the code there is one more check
> before deleting the old buffer, where changes can be rolled back. The
> other functions where it is checked 'refcount' and 'usage_count' put
> exclusive locks.

I don't think this is correct. This way we can leave the for (;;) loop
in BufferAlloc() thinking that the buffer is unused (and can't be further
pinned because of the held spinlock!) while it actually has been pinned
since by PinBuffer(). Additionally oldFlags can get out of sync there.

I don't think the approach of making some of the fields atomics but not
really caring about the rest is going to work. My suggestion is to add a
single 'state' 32bit atomic. This 32bit state is subdivided into:

10bit for flags,
3bit for usage_count,
16bit for refcount

then turn each operation that currently uses one of these fields into
corresponding accesses (just different values for flags, bit-shiftery &
mask for reading usage count, bit mask for reading refcount). The trick
then is to add a *new* flag value BM_LOCKED. This can then act as a sort
of a 'one bit' spinlock.

That should roughly look like (more or less pseudocode):

LockBufHdr(BufferDesc *desc)
int state = pg_atomic_read_u32(&desc->state);

for (;;)
/* wait till lock is free */
while (unlikely(state & BM_LOCKED))
state = pg_atomic_read_u32(&desc->state);

/* add exponential backoff? Should seldomly be contended tho. */

/* and try to get lock */
if (pg_atomic_compare_exchange_u32(&desc->state, &state, state | BM_LOCKED))

static bool
PinBuffer(volatile BufferDesc *buf, BufferAccessStrategy strategy)
if (ref == NULL)
ref = NewPrivateRefCountEntry(b);

int state = pg_atomic_read_u32(&desc->state);
int oldstate = state;

while (true)

/* spin-wait till lock is free */
while (unlikely(state & BM_LOCKED))
state = pg_atomic_read_u32(&desc->state);

/* increase refcount */
state += 1;

/* increase usagecount unless already max */

result = (state & BM_VALID) != 0;

if (pg_atomic_compare_exchange_u32(&desc->state, &oldstate, state))

/* get ready for next loop, oldstate has been updated by cas */
state = oldstate;

other callsites can either just plainly continue to use
LockBufHdr/UnlockBufHdr or converted similarly to PinBuffer().


Andres Freund

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2015-09-11 16:15:43 Re: Move PinBuffer and UnpinBuffer to atomics
Previous Message Thom Brown 2015-09-11 16:12:48 Re: [PROPOSAL] VACUUM Progress Checker.