Re: PrivateRefCount patch has got issues

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: PrivateRefCount patch has got issues
Date: 2014-12-21 18:21:56
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On 2014-12-16 18:25:13 -0500, Tom Lane wrote:
>> I just happened to look into bufmgr.c for the first time in a while, and
>> noticed the privaterefcount-is-no-longer-a-simple-array stuff. It doesn't
>> look too well thought out to me. In particular, PinBuffer_Locked calls
>> GetPrivateRefCountEntry while holding a buffer-header spinlock. That
>> seems completely unacceptable.

> Argh, yes. That certainly isn't ok.

> The easiest way to fix that seems to be to declare that PinBuffer_Locked
> can only be used when we're guaranteed to not have pinned the
> buffer. That happens to be true for all the existing users. In fact all
> of them even seem to require the refcount to be zero across all
> backends. That prerequisite then allows us to increase the buffer header
> refcount before releasing the spinlock *and* before increasing the
> private refcount.

Hm, if you do it like that, what happens if we get a palloc failure while
trying to record the private refcount? I think you must not bump the pin
count in shared memory unless you're certain you can record the fact that
you've done so.

The idea I'd been wondering about hinged on the same observation that we
know the buffer is not pinned (by our process) already, but the mechanics
would be closer to what we do in resource managers: reserve space first,
do the thing that needs to be remembered, bump the count using the
reserved space. Given the way you've set this up, the idea boils down to
having a precheck call that forces there to be an empty slot in the local
fastpath array (by pushing something out to the hash table if necessary)
before we start trying to pin the buffer. Then it's guaranteed that the
"record" step will succeed. You could possibly even arrange it so that
it's known which array entry needs to be used and then the "record" part
is just a couple of inline instructions, so that it'd be OK to do that
while still holding the spinlock. Otherwise it would still be a good idea
to do the "record" after releasing the spinlock IMO; but either way this
avoids the issue of possible state inconsistency due to a midflight
palloc failure.
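
A minimal sketch of the reserve-then-record ordering described above, not the actual bufmgr.c code. Every name in it (FAST_SLOTS, RefEntry, reserve_entry, record_pin, pin_buffer, the overflow array standing in for the hash table, shared_refcount standing in for the buffer headers) is hypothetical, and the spinlock is only marked by comments. The point is that the one step that can fail, the allocation, happens before any shared state is touched.

```c
#include <assert.h>
#include <stdlib.h>

#define FAST_SLOTS  4        /* assumed size of the local fast-path array */
#define NBUFFERS    16       /* assumed number of shared buffers */
#define INVALID_BUF (-1)

typedef struct { int buf; int count; } RefEntry;

RefEntry fast[FAST_SLOTS];          /* local fast-path array */
RefEntry *overflow = NULL;          /* stand-in for the hash table */
int overflow_used = 0;
int overflow_cap = 0;
RefEntry *reserved = NULL;          /* slot known to be free, if any */
int shared_refcount[NBUFFERS];      /* stand-in for buffer-header refcounts */

void refcount_init(void)
{
    for (int i = 0; i < FAST_SLOTS; i++)
        fast[i].buf = INVALID_BUF;
}

/*
 * Precheck: make sure one fast-path slot is empty, pushing an entry out to
 * the overflow area if necessary. This may allocate and therefore may fail,
 * so it must run before the spinlock is taken and before shared state
 * changes. Returns 0 on success, -1 on allocation failure.
 */
static int reserve_entry(void)
{
    if (reserved != NULL)
        return 0;
    for (int i = 0; i < FAST_SLOTS; i++) {
        if (fast[i].buf == INVALID_BUF) {
            reserved = &fast[i];
            return 0;
        }
    }
    /* Array is full: grow the overflow area first, then evict slot 0. */
    if (overflow_used == overflow_cap) {
        int ncap = overflow_cap ? overflow_cap * 2 : 8;
        RefEntry *n = realloc(overflow, (size_t) ncap * sizeof(RefEntry));
        if (n == NULL)
            return -1;              /* nothing has been modified yet */
        overflow = n;
        overflow_cap = ncap;
    }
    overflow[overflow_used++] = fast[0];
    fast[0].buf = INVALID_BUF;
    reserved = &fast[0];
    return 0;
}

/* Record step: a couple of stores into the reserved slot; cannot fail. */
static void record_pin(int buf)
{
    assert(reserved != NULL);
    reserved->buf = buf;
    reserved->count = 1;
    reserved = NULL;
}

/* Pin a buffer this process is known not to have pinned already. */
int pin_buffer(int buf)
{
    if (reserve_entry() != 0)
        return -1;                  /* failed before touching shared memory */
    /* buffer-header spinlock would be acquired here */
    shared_refcount[buf]++;
    /* buffer-header spinlock would be released here */
    record_pin(buf);
    return 0;
}
```

With FAST_SLOTS set to 4, calling refcount_init() and then pin_buffer() for six distinct buffers pushes two entries out to the overflow area. Since record_pin() is only a few stores, it could also be moved ahead of the release point without doing anything expensive under the spinlock.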

regards, tom lane
