Re: remove spurious CREATE INDEX CONCURRENTLY wait

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, James Coleman <jtc331(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: remove spurious CREATE INDEX CONCURRENTLY wait
Date: 2020-11-24 00:40:56
Message-ID: 20201124004056.pabj4vzibbx5bwmp@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-11-23 12:30:05 -0300, Alvaro Herrera wrote:
> ProcSleep: no bug here.
> The flags are checked to see if we should kill() the process (used when
> autovac blocks some other process). The lock here appears to be
> inconsequential, since we release it before we do kill(); so strictly
> speaking, there is still a race condition where the autovac process
> could reset the flag after we read it and before we get a chance to
> signal it. The lock level autovac uses to set the flag is not relevant
> either.

Yea. Even before the recent changes building the lock message under the
lock didn't make sense. Now that the flags are mirrored in pgproc, we
probably should just make this use READ_ONCE(autovac->statusFlags),
without any other use of ProcArrayLock. As you say the race condition
is between the flag test, the kill, and the signal being processed,
which wasn't previously protected either.

> PROC_IN_LOGICAL_DECODING:
> Oddly enough, I think the reset of PROC_IN_LOGICAL_DECODING in
> ReplicationSlotRelease might be the most problematic one of the lot.
> That's because a proc's xmin that had been ignored all along by
> ComputeXidHorizons, will now be included in the computation. Adding
> asserts that proc->xmin and proc->xid are InvalidXid by the time we
> reset the flag, I got hits in pg_basebackup, test_decoding and
> subscription tests. I think it's OK for ComputeXidHorizons (since it
> just means that a vacuum that reads a later will remove less rows.) But
> in GetSnapshotData it is just not correct to have the Xmin go backwards.

I don't think there's a problem. PROC_IN_LOGICAL_DECODING can only be
set when outside a transaction block, i.e. walsender. In which case
there shouldn't be an xid/xmin, I think? Or did you gate your assert on
PROC_IN_LOGICAL_DECODING being set?

> GetCurrentVirtualXIDs, ComputeXidHorizons, GetSnapshotData:
>
> In these cases, what we need is that the code computes some xmin (or
> equivalent computation) based on a set of transactions that exclude
> those marked with the flags. The behavior we want is that if some
> transaction is marked as vacuum, we ignore the Xid/Xmin *if there is
> one*. In other words, if there's no Xid/Xmin, then the flag is not
> important. So if we can ensure that the flag is set first, and the
> Xid/xmin is installed later, that's sufficient correctness and we don't
> need to hold exclusive lock.

That'd require at least memory barriers in GetSnapshotData()'s loop,
which I'd really like to avoid. Otherwise the order in which memory gets
written in one process doesn't guarantee the order of visibility in
another process...

> In other words, my conclusion is that there definitely is a bug here and
> I am going to restore the use of exclusive lock for setting the
> statusFlags.

Cool.

> GetSnapshotData has an additional difficulty -- we do the
> UINT32_ACCESS_ONCE(ProcGlobal->xid[i]) read *before* testing the bit.
> So it's not valid to set the bit without locking out GSD, regardless of
> any barriers we had; if we want this to be safe, we'd have to change
> this so that the flag is read first, and we read the xid only
> afterwards, with a read barrier.

Which we don't want, because that'd mean slowing down the *extremely*
common case of the majority of backends neither having an xid assigned
nor doing logical decoding / vacuum. We'd be reading two cachelines
instead of one.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Nancarrow 2020-11-24 00:44:48 Re: Parallel plans and "union all" subquery
Previous Message David Rowley 2020-11-24 00:12:09 Re: Keep elog(ERROR) and ereport(ERROR) calls in the cold path