Re: [PATCH] Fix minor race in commit_ts SLRU truncation vs lookups

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Fix minor race in commit_ts SLRU truncation vs lookups
Date: 2017-01-23 04:34:27
Message-ID: CAMsr+YH=FCZWfAPUTc5GKTxB6X-beJ6DEoMMN_KNMUP5k3bSGw@mail.gmail.com
Lists: pgsql-hackers

On 20 January 2017 at 21:40, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:

> One option would be to add another limit Xid which advances before the
> truncation but which is not used for other decisions other than limiting
> what can users consult.

This could be useful for other things, but it's probably heavier than needed.

What I've done in the latest revision of the txid_status() patch is
simply to advance OldestXid _before_ truncating the clog. The rest of
the xid info is advanced after. Currently this is incorporated into
the txid_status patch, but can be separated if desired.

Relevant commit message portion:

There was previously no way to look up an arbitrary xid without
running the risk of having clog truncated out from under you. This
hasn't been a problem because anything looking up xids in clog knows
they're protected by datminxid, but that's not the case for arbitrary
user-supplied XIDs. clog was truncated before we advanced oldestXid, so
taking XidGenLock was insufficient. There's no way to look up an
SLRU with soft-failure. To address this, increase oldestXid under XidGenLock
before we truncate clog rather than after, so concurrent access is safe.

Note that while oldestXid is advanced before clog truncation, the xid
limits are advanced _after_ it. If we advanced the xid limits before
truncation too, we'd theoretically run the risk of allocating an xid
from the clog section we're about to truncate, which would be no fun.
(In practice it can't really happen, since we only use half the
available xid space at a time.)

Moving the lower bound up, truncating, and moving the upper bound up
is the way to go IMO.
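
To illustrate that ordering, here is a toy model in C. It is only a sketch
and not the code from the patch: the struct, the function names, and the
enum are hypothetical, locking is omitted, and xid wraparound is ignored so
that plain integer comparisons work.

```c
#include <assert.h>
#include <stdint.h>

/* Toy state; hypothetical names, not PostgreSQL's real structures. */
typedef struct {
    uint32_t oldest_xid;  /* lowest xid a lookup is allowed to ask about */
    uint32_t clog_start;  /* first xid whose status is still stored in clog */
    uint32_t xid_limit;   /* upper bound on xids that may be allocated */
} XidState;

typedef enum {
    LOOKUP_OK,            /* xid is in range and its clog data exists */
    LOOKUP_TOO_OLD,       /* rejected cleanly: xid precedes oldest_xid */
    LOOKUP_HIT_TRUNCATED  /* passed the check but the data is gone */
} LookupResult;

/* A lookup trusts oldest_xid as the lower bound, then reads clog. */
static LookupResult lookup(const XidState *s, uint32_t xid)
{
    if (xid < s->oldest_xid)
        return LOOKUP_TOO_OLD;
    if (xid < s->clog_start)
        return LOOKUP_HIT_TRUNCATED;
    return LOOKUP_OK;
}

/* Step 1: move the lower bound up (done under XidGenLock in the real system). */
static void advance_oldest_xid(XidState *s, uint32_t new_oldest)
{
    assert(new_oldest >= s->oldest_xid);  /* never moves backwards in this model */
    s->oldest_xid = new_oldest;
}

/* Step 2: discard clog data below the new lower bound. */
static void truncate_clog(XidState *s, uint32_t new_start)
{
    s->clog_start = new_start;
}

/* Step 3: only now move the allocation limit up. */
static void advance_xid_limit(XidState *s, uint32_t new_limit)
{
    s->xid_limit = new_limit;
}
```

Calling the three steps in that order keeps clog_start <= oldest_xid at
every point, so a lookup of an old xid gets LOOKUP_TOO_OLD both before and
after the truncation. Calling truncate_clog() first opens a window in which
the same lookup returns LOOKUP_HIT_TRUNCATED, which is the race described
above.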

> Another option is not to implement direct reads
> from the clog.

I think there's a pretty decent argument for having clog lookups;
txid_status(...) serves as a useful halfway position between accepting
indeterminate commit status on connection loss and using full 2PC.

> Yet another option is that before we add such interface
> somebody produces proof that the problem does not, in fact, exist.

It does exist.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
