Re: [PATCH] Transaction traceability - txid_status(bigint)

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Subject: Re: [PATCH] Transaction traceability - txid_status(bigint)
Date: 2016-12-21 08:02:15
Views: Raw Message | Whole Thread | Download mbox
Lists: pgsql-hackers

On 27 September 2016 at 09:23, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
> On 21 September 2016 at 22:16, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> It might not be too hard to add a second copy of oldestXid in shared
>> memory that is updated before truncation rather than afterward... but
>> yeah, like you, I'm not finding this nearly as straightforward as
>> might have been hoped.
> Yeah.
> I suspect that'll be the way to go, to add another copy that's updated
> before clog truncation. It just seems ... unclean. Like it shouldn't
> be necessary, something else isn't right. But it's probably the lowest
> pain option.

OK, so summarizing the problem:

slru.c and clog.c have no soft-failure entrypoints where we can look
it up and fail gracefully if the xid isn't in the clog.
vac_truncate_clog() calls TruncateCLOG to chop SLRU pages from the
clog before it takes XidGenLock to advance oldestXid. So we cannot use
oldestXid protected by XidGenLock as an interlock against concurrent
clog truncation to prevent SLRU access errors looking up an xid, and
there's no other candidate lock.

This hasn't been an issue before because nobody looks up arbitrary
user-supplied XIDs in clog, we only look up things that're protected
by datfrozenxid etc. But the whole point of txid_status is to look up
user-supplied xids.

We can't just take ClogControlLock from txid_status() to block
truncation because clog.c expects to own that lock, and takes it (via
slru.c) in TransactionIdGetStatus, with no way to pass an
already_locked state. We'd self-deadlock.

Adding a second copy of oldestXid - say pendingOldestXid - won't
actually help us unless we also take some suitable LWLock before
updating it, otherwise truncation can continue after we look at
pendingOldestXid but before we do the clog lookup. That means an extra
LWLock for each clog truncation, but compared to the I/O done during
clog truncation and the cost of the SlruScanDirectory() pre-check done
by TruncateCLOG it's nothing.

We could take XidGenLock twice, once to update this new
pendingOldestXid field and once to update oldestXid, but that's a
highly contested lock I'd rather not mess with even on a path that's
not hit much.

Instead, I've added a new LWLock, ClogTruncationLock, for that
purpose. vac_truncate_clog() takes it if it decides to attempt clog
truncation. This lock is held throughout the whole process of clog
truncation and oldestXid advance, so there's no need for a new
pendingOldestXid field in ShmemVariableCache. We just take the lock
then look at oldestXid, knowing that it's guaranteed to correspond to
an existing clog page that won't go away while we're looking.
ClogTruncationLock is utterly uncontested so it's going to have no
meaningful impact compared to all the work we do scanning the clog to
decide whether we're even going to try truncating it, etc.

(BTW, it seems like a pity that lwlocknames.txt doesn't have comments
on each lock. We could have
src/backend/storage/lmgr/ transform the #
comments into /* comments */ on the generated header. Thoughts?)

Craig Ringer
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0001-Introduce-txid_status-bigint-to-get-status-of-an-xac.patch text/x-patch 17.7 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2016-12-21 08:07:24 Re: bigint vs txid user confusion
Previous Message Andreas Karlsson 2016-12-21 07:55:08 Re: Why does plpython delay composite type resolution?