Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Noah Misch <noah(at)leadboat(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data
Date: 2021-08-29 07:27:31
Message-ID: 15936F8A-0436-40C8-9D1A-4A4623681554@yandex-team.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi!

> 15 авг. 2021 г., в 18:45, Andres Freund <andres(at)anarazel(dot)de> написал(а):
>
> Hi,
>
> On 2021-08-15 16:09:37 +0500, Andrey Borodin wrote:
>> From 929736512ebf8eb9ac6ddaaf49b9e6148d72cfbb Mon Sep 17 00:00:00 2001
>> From: Andrey Borodin <amborodin(at)acm(dot)org>
>> Date: Fri, 30 Jul 2021 14:40:16 +0500
>> Subject: [PATCH v12 2/6] PoC fix for race in RelationBuildDesc() and relcache
>> invalidation
>>
>> ---
>> src/backend/utils/cache/relcache.c | 28 ++++++++++++++++++++++++++++
>> 1 file changed, 28 insertions(+)
>>
>> diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
>> index 13d9994af3..7eec7b1f41 100644
>> --- a/src/backend/utils/cache/relcache.c
>> +++ b/src/backend/utils/cache/relcache.c
>> @@ -997,9 +997,16 @@ equalRSDesc(RowSecurityDesc *rsdesc1, RowSecurityDesc *rsdesc2)
>> * (suggesting we are trying to access a just-deleted relation).
>> * Any other error is reported via elog.
>> */
>> +typedef struct InProgressRels {
>> + Oid relid;
>> + bool invalidated;
>> +} InProgressRels;
>> +static InProgressRels inProgress[100];
>> +
>> static Relation
>> RelationBuildDesc(Oid targetRelId, bool insertIt)
>> {
>> + int in_progress_offset;
>> Relation relation;
>> Oid relid;
>> HeapTuple pg_class_tuple;
>> @@ -1033,6 +1040,14 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
>> }
>> #endif
>>
>> + for (in_progress_offset = 0;
>> + OidIsValid(inProgress[in_progress_offset].relid);
>> + in_progress_offset++)
>> + ;
>> + inProgress[in_progress_offset].relid = targetRelId;
>> +retry:
>> + inProgress[in_progress_offset].invalidated = false;
>> +
>> /*
>> * find the tuple in pg_class corresponding to the given relation id
>> */
>> @@ -1213,6 +1228,12 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
>> */
>> heap_freetuple(pg_class_tuple);
>>
>> + if (inProgress[in_progress_offset].invalidated)
>> + goto retry; /* TODO free old one */
>> + /* inProgress is in fact the stack, we can safely remove it's top */
>> + inProgress[in_progress_offset].relid = InvalidOid;
>> + Assert(inProgress[in_progress_offset + 1].relid == InvalidOid);
>> +
>> /*
>> * Insert newly created relation into relcache hash table, if requested.
>> *
>> @@ -2805,6 +2826,13 @@ RelationCacheInvalidateEntry(Oid relationId)
>> relcacheInvalsReceived++;
>> RelationFlushRelation(relation);
>> }
>> + else
>> + {
>> + int i;
>> + for (i = 0; OidIsValid(inProgress[i].relid); i++)
>> + if (inProgress[i].relid == relationId)
>> + inProgress[i].invalidated = true;
>> + }
>> }
>
> Desperately needs comments. Without a commit message and without
> comments it's hard to review this without re-reading the entire thread -
> which approximately nobody will do.
I've added some comments. But it seems we should use dynamic allocations instead of 100-based array.

>
>
>> From 7e47dae2828d88ddb2161fda0c3b08a158c6cf37 Mon Sep 17 00:00:00 2001
>> From: Andrey Borodin <amborodin(at)acm(dot)org>
>> Date: Sat, 7 Aug 2021 20:27:14 +0500
>> Subject: [PATCH v12 5/6] PoC fix clear xid
>>
>> ---
>> src/backend/access/transam/xact.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
>> index 387f80419a..9b19d939eb 100644
>> --- a/src/backend/access/transam/xact.c
>> +++ b/src/backend/access/transam/xact.c
>> @@ -2500,7 +2500,6 @@ PrepareTransaction(void)
>> * done *after* the prepared transaction has been marked valid, else
>> * someone may think it is unlocked and recyclable.
>> */
>> - ProcArrayClearTransaction(MyProc);
>>
>> /*
>> * In normal commit-processing, this is all non-critical post-transaction
>> @@ -2535,6 +2534,8 @@ PrepareTransaction(void)
>> PostPrepare_MultiXact(xid);
>>
>> PostPrepare_Locks(xid);
>> +
>> + ProcArrayClearTransaction(MyProc);
>> PostPrepare_PredicateLocks(xid);
>
> The comment above ProcArrayClearTransaction would need to be moved and
> updated...
Fixed.

>
>> From 6db9cafd146db1a645bb6885157b0e1f032765e0 Mon Sep 17 00:00:00 2001
>> From: Andrey Borodin <amborodin(at)acm(dot)org>
>> Date: Mon, 19 Jul 2021 11:50:02 +0500
>> Subject: [PATCH v12 4/6] Fix CREATE INDEX CONCURRENTLY in precence of vxids
>> converted to 2pc
> ...
>
>> +/*
>> + * TwoPhaseGetXidByVXid
>> + * Try to lookup for vxid among prepared xacts
>> + */
>> +XidListEntry
>> +TwoPhaseGetXidByVXid(VirtualTransactionId vxid)
>> +{
>> + int i;
>> + XidListEntry result;
>> + result.next = NULL;
>> + result.xid = InvalidTransactionId;
>> +
>> + LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
>> +
>> + for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
>> + {
>> + GlobalTransaction gxact = TwoPhaseState->prepXacts[i];
>> + PGPROC *proc = &ProcGlobal->allProcs[gxact->pgprocno];
>> + VirtualTransactionId proc_vxid;
>> + GET_VXID_FROM_PGPROC(proc_vxid, *proc);
>> +
>> + if (VirtualTransactionIdEquals(vxid, proc_vxid))
>> + {
>> + if (result.xid != InvalidTransactionId)
>> + {
>> + /* Already has candidate - need to alloc some space */
>> + XidListEntry *copy = palloc(sizeof(XidListEntry));
>> + copy->next = result.next;
>> + copy->xid = result.xid;
>> + result.next = copy;
>> + }
>> + result.xid = gxact->xid;
>> + }
>> + }
>> +
>> + LWLockRelease(TwoPhaseStateLock);
>> +
>> + return result;
>> +}
>
> Dynamic memory allocations while holding a fairly central lock - one
> which is now going to be more contended - doesn't seem great.
>
> Is the memory context this is called in guaranteed to be of a proper
> duration? Including being reset in case there's an error at some point
> before the memory is freed?
I've removed custom list and all memory allocations. If there are multiple 2PCs with same vxid - we just wait for one, then retry.

>> +/*
>> + * WaitXact
>> + *
>> + * Wait for xid completition if have xid. Otherwise try to find xid among
>> + * two-phase procarray entries.
>> + */
>> +static bool WaitXact(VirtualTransactionId vxid, TransactionId xid, bool wait)
>> +{
>> + LockAcquireResult lar;
>> + LOCKTAG tag;
>> + XidListEntry xidlist;
>> + XidListEntry *xidlist_ptr = NULL; /* pointer to TwoPhaseGetXidByVXid()s pallocs */
>> + bool result;
>> +
>> + if (TransactionIdIsValid(xid))
>> + {
>> + /* We know exact xid - no need to search in 2PC state */
>> + xidlist.xid = xid;
>> + xidlist.next = NULL;
>> + }
>> + else
>> + {
>> + /* You cant have vxid among 2PCs if you have no 2PCs */
>> + if (max_prepared_xacts == 0)
>> + return true;
>> +
>> + /*
>> + * We must search for vxids in 2pc state
>> + * XXX: O(N*N) complexity where N is number of prepared xacts
>> + */
>> + xidlist = TwoPhaseGetXidByVXid(vxid);
>> + /* Return if transaction is gone entirely */
>> + if (!TransactionIdIsValid(xidlist.xid))
>> + return true;
>> + xidlist_ptr = xidlist.next;
>> + }
>
> Perhaps I missed this - but won't we constantly enter this path for
> non-2pc transactions? E.g.
I've restored heuristics that if it's not 2PC - we just exit from WaitPreparedXact().

>
>> @@ -4573,7 +4649,7 @@ VirtualXactLock(VirtualTransactionId vxid, bool wait)
>> */
>> proc = BackendIdGetProc(vxid.backendId);
>> if (proc == NULL)
>> - return true;
>> + return WaitXact(vxid, InvalidTransactionId, wait);
>>
>> /*
>> * We must acquire this lock before checking the backendId and lxid
>> @@ -4587,9 +4663,12 @@ VirtualXactLock(VirtualTransactionId vxid, bool wait)
>> || proc->fpLocalTransactionId != vxid.localTransactionId)
>> {
>> LWLockRelease(&proc->fpInfoLock);
>> - return true;
>> + return WaitXact(vxid, InvalidTransactionId, wait);
>> }
>
> It seems like it's going to add a substantial amount of work even when
> no 2PC xacts are involved?
Only if 2PCs are enabled.

>> diff --git a/src/include/access/twophase.h b/src/include/access/twophase.h
>> index e27e1a8fe8..a5f28d3a80 100644
>> --- a/src/include/access/twophase.h
>> +++ b/src/include/access/twophase.h
>> @@ -25,6 +25,17 @@
>> */
>> typedef struct GlobalTransactionData *GlobalTransaction;
>>
>> +/*
>> + * XidListEntry is expected to be used as list very rarely. Under normal
>> + * circumstances TwoPhaseGetXidByVXid() returns only one xid.
>> + * But under certain conditions can return many xids or nothing.
>> + */
>> +typedef struct XidListEntry
>> +{
>> + TransactionId xid;
>> + struct XidListEntry* next;
>> +} XidListEntry;
>
> I don't think we should invent additional ad-hoc list types.
Fixed, removed list here entirely.

I'm attaching new version. It works fine on my machines.

Thanks!

Best regards, Andrey Borodin.

Attachment Content-Type Size
v13-0001-Introduce-TAP-test-for-CIC.patch application/octet-stream 5.6 KB
v13-0006-Do-CIC-test-time-based-to-ensure-bug-repro.patch application/octet-stream 1.6 KB
v13-0005-Reorder-steps-of-2PC-preparation.patch application/octet-stream 1.7 KB
v13-0004-Fix-CREATE-INDEX-CONCURRENTLY-in-precence-of-vxi.patch application/octet-stream 7.1 KB
v13-0003-Introduce-TAP-test-for-2PC-with-CIC-behavior.patch application/octet-stream 4.6 KB
v13-0002-PoC-fix-for-race-in-RelationBuildDesc-and-relcac.patch application/octet-stream 3.3 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Andres Freund 2021-08-29 18:09:13 Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data
Previous Message Yongqian Li 2021-08-29 01:04:28 Unexpected behavior from using default config value