Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

From: Noah Misch <noah(at)leadboat(dot)com>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)bowt(dot)ie>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data
Date: 2021-09-25 20:10:12
Message-ID: 20210925201012.GC4134968@rfd.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Sat, Sep 25, 2021 at 10:25:05PM +0500, Andrey Borodin wrote:
> > 20 сент. 2021 г., в 09:41, Noah Misch <noah(at)leadboat(dot)com> написал(а):
> > On Mon, Aug 23, 2021 at 10:38:00PM +0500, Andrey Borodin wrote:
> >> --- a/src/backend/access/transam/twophase.c
> >> +++ b/src/backend/access/transam/twophase.c
> >> @@ -459,14 +459,15 @@ MarkAsPreparingGuts(GlobalTransaction gxact, TransactionId xid, const char *gid,
> >> proc->pgprocno = gxact->pgprocno;
> >> SHMQueueElemInit(&(proc->links));
> >> proc->waitStatus = PROC_WAIT_STATUS_OK;
> >> - /* We set up the gxact's VXID as InvalidBackendId/XID */
> >> - proc->lxid = (LocalTransactionId) xid;
> >> + /* We set up the gxact's VXID as real for CIC purposes */
> >> + proc->lxid = MyProc->lxid;
> >
> > This breaks the case where the server restarted after PREPARE TRANSACTION.
> > MyProc->lxid is 0 in the startup process, and LocalTransactionIdIsValid(0) is
> > false. I'm attaching a test case addition. Can you repair this?
> Yup. Indeed, that's a bug. Root cause it that GetLockConflicts() does not try to extract real xid from gxact's PGPROC, while vxid is not valid.
> I see two ways to solve this:
> 1. Always set valid vxid, but fake 'vxid from xid' for gxact.
> 2. Teach GetLockConflicts() to use xid if vxid is invalid.
> Both ways lead to identical GetLockConflicts() output.
> PFA implementation of approach 1.

That's reasonable. I'll queue a task to review the rest of the patch.

> > On Tue, Sep 07, 2021 at 11:45:15PM -0700, Noah Misch wrote:
> >> On Sun, Aug 29, 2021 at 11:38:03PM +0500, Andrey Borodin wrote:
> >>>> 29 авг. 2021 г., в 23:09, Andres Freund <andres(at)anarazel(dot)de> написал(а):
> >>>>>> It seems like it's going to add a substantial amount of work even when
> >>>>>> no 2PC xacts are involved?
> >>>>> Only if 2PCs are enabled.
> >>>>
> >>>> I don't think that's good enough. Plenty of systems have 2PC enabled but very
> >>>> few if any transactions end up as 2PC ones.
> >>
> >>> Best optimisation I can imagine here is to gather all vxids with unknown xids and search for them in one call to TwoPhaseGetXidByVXid() with one LWLockAcquire(TwoPhaseStateLock, LW_SHARED).
> >>>
> >>> Does it worth the complexity?
> >>
> >> https://www.postgresql.org/search/?m=1&q=TwoPhaseStateLock&l=&d=-1&s=r
> >> suggests this is the first postgresql.org discussion of TwoPhaseStateLock as a
> >> bottleneck. Nonetheless, if Andres Freund finds it's worth the complexity,
> >> then I'm content with it. I'd certainly expect some performance benefit.
> >> Andres, what do you think?
> >
> > A few more benefits (beyond lock contention) come to mind:
> >
> > - Looking at the three VirtualXactLock() callers, waiting for final
> > disposition of prepared transactions is necessary for
> > WaitForLockersMultiple(), disadvantageous for WaitForOlderSnapshots(), and
> > dead code for ResolveRecoveryConflictWithVirtualXIDs(). In
> > WaitForOlderSnapshots(), PREPARE is as good as COMMIT/ABORT, because a
> > prepared transaction won't do further database reads. Waiting on the
> > prepared transaction there could give CIC an arbitrarily-long, needless
> > delay. ResolveRecoveryConflictWithVirtualXIDs() will never wait on a
> > prepared transaction, because prepared transactions hold no locks during
> > recovery. (If a prepared transaction originally acquired
> > AccessExclusiveLock, the startup process holds that lock on its behalf.)
> > Coordinating the XID search at a higher layer would let us change
> > WaitForLockersMultiple() without changing the others.
> BTW WaitForOlderSnapshots() is used in ATExecDetachPartitionFinalize(). Probably, we could indicate to VirtualXactLock() if we want to wait on 2PC or not. Does it make sense?

For both CIC and ATExecDetachPartitionFinalize(), one needs unlucky timing for
the operation to needlessly wait on a prepared xact. Specifically, the
needless wait arises if a PREPARE happens while WaitForOlderSnapshots() is
running, after its GetCurrentVirtualXIDs() and before its VirtualXactLock().
I would not choose to "indicate to VirtualXactLock() if we want to wait on 2PC
or not", but code like that wouldn't be too bad. I probably wouldn't remove
code like that if you chosen to write it.

The alternative I have in mind would work like the following pseudocode. I'm
leaning toward thinking it's not worth doing, since none of the three benefits
are known to be important. But maybe it is worth doing.

struct LockConflictData
{
/* VXIDs seen to have XIDs */
List *vxid_of_xid;
/* VXIDs without known XIDs */
List *vxid_no_xid;
/*
* XIDs. Has one entry per entry in vxid_of_xid, and it may have up to
* max_prepared_transactions additional entries.
*/
List *xid;
};
void
WaitForLockersMultiple(List *locktags, LOCKMODE lockmode, bool progress)
{
struct LockConflictData holders;
List *latest_xid = NIL;
List *need_mapping = NIL;
ListCell *lc;
int total = 0;
int done = 0;

/* Collect the transactions we need to wait on */
foreach(lc, locktags)
{
LOCKTAG *locktag = lfirst(lc);
int count;

GetLockConflicts(&holders, locktag, lockmode,
progress ? &count : NULL);
}

/* wait on VXIDs known to have XIDs, and wait on known XIDs */
foreach(lc, holders.vxid_of_xid)
VirtualXactLock(lfirst_int(lc), true, NULL);
foreach(lc, holders.xid)
XactLockTableWait(lfirst_int(lc), other_params);
/* wait on remaining VXIDs, possibly discovering more XIDs */
foreach(lc, holders.vxid_no_xid)
{
VirtualTransactionId *v = lfirst(lc);
TransactionId xid = InvalidTransactionId;
/*
* Under this design, VirtualXactLock() would know nothing about 2PC,
* but it would gain the ability to return proc->xid of the waited
* proc. Under this design, VirtualTransactionId is always a local
* transaction, like it was before commit 8a54e12.
*/
VirtualXactLock(*v, true, &xid);
if (TransactionIdIsValid(xid))
latest_xid = lappend_int(late_xid, xid);
else
need_mapping = lappend_int(need_mapping, v);
}
/* wait on XIDs just discovered */
foreach(lc, latest_xid)
XactLockTableWait(lfirst_int(lc), other_params);
/*
* If we never saw an XID associated with a particular VXID, check whether
* the VXID became a prepared xact.
*/
latest_xid = TwoPhaseGetXidByVXid(need_mapping);
foreach(lc, latest_xid)
XactLockTableWait(lfirst_int(lc), other_params);
}

> > - v13 WaitPreparedXact() experiences starvation when a steady stream of
> > prepared transactions have the same VXID. Since VXID reuse entails
> > reconnecting, starvation will be unnoticeable in systems that follow best
> > practices around connection lifespan. The 2021-08-23 patch version didn't
> > have that hazard.
> I think the probability of such a stream is abysmal. You not only need a stream of 2PC with the same vxid, but a stream of overlapping 2PC with the same vxid. And the most critical thing that can happen - CIC waiting for the stream to became one-2PC-at-a-time for a moment.

You're probably right about that.

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Etsuro Fujita 2021-09-26 08:56:57 Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Previous Message Andrey Borodin 2021-09-25 17:25:05 Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data