Re: Unstable tests for recovery conflict handling

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Unstable tests for recovery conflict handling
Date: 2022-08-03 17:57:02
Message-ID: 20220803175702.lhqqhcyhuykwbkcu@awork3.anarazel.de
Lists: pgsql-committers pgsql-hackers

Hi,

On 2022-07-26 13:03:54 -0700, Andres Freund wrote:
> On 2022-07-26 14:30:30 -0400, Tom Lane wrote:
> > Andres Freund <andres(at)anarazel(dot)de> writes:
> > > On 2022-07-26 13:57:53 -0400, Tom Lane wrote:
> > >> So this is not a case of RecoveryConflictInterrupt doing the wrong thing:
> > >> the startup process hasn't detected the buffer conflict in the first
> > >> place.
> >
> > > I wonder if this, at least partially, could be be due to the elog thing
> > > I was complaining about nearby. I.e. we decide to FATAL as part of a
> > > recovery conflict interrupt, and then during that ERROR out as part of
> > > another recovery conflict interrupt (because nothing holds interrupts as
> > > part of FATAL).
> >
> > There are all sorts of things one could imagine going wrong in the
> > backend receiving the recovery conflict interrupt, but AFAICS in these
> > failures, the startup process hasn't sent a recovery conflict interrupt.
> > It certainly hasn't logged anything suggesting it noticed a conflict.
>
> I don't think we reliably emit a log message before the recovery
> conflict is resolved.

I played around trying to reproduce this kind of issue.

One way to quickly run into trouble on a slow system is that
ResolveRecoveryConflictWithVirtualXIDs() can end up sending signals more
frequently than the target can process them. The next signal can arrive before
SIGUSR1 processing has finished, which, at least on Linux, causes the queued
signal to be processed immediately once the handler returns, without "normal"
postgres code ever regaining control.
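
To illustrate the Linux behavior I mean, here's a hypothetical standalone demo
(not postgres code, and usleep() in a signal handler isn't strictly
signal-safe, but good enough for a sketch): if the sender signals faster than
the receiver's handler finishes, nearly every delivery happens back-to-back,
with the interrupted "normal" code never running in between.

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

static volatile sig_atomic_t handler_calls = 0;
static volatile sig_atomic_t mainline_ran = 1;
static volatile sig_atomic_t back_to_back = 0;

static void
handler(int signo)
{
    (void) signo;

    /* did the interrupted "normal" code get to run since the last signal? */
    if (handler_calls > 0 && !mainline_ran)
        back_to_back++;
    mainline_ran = 0;
    handler_calls++;

    usleep(20 * 1000);          /* simulate slow SIGUSR1 processing (20ms) */
}

int
main(void)
{
    struct sigaction sa;
    pid_t parent = getpid();
    pid_t child;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, NULL);

    child = fork();
    if (child == 0)
    {
        /* signal the parent every 5ms, faster than it can process */
        for (int i = 0; i < 100; i++)
        {
            kill(parent, SIGUSR1);
            usleep(5 * 1000);
        }
        _exit(0);
    }

    /* "normal" code: just note every time we get to run at all */
    while (waitpid(child, NULL, WNOHANG) != child)
        mainline_ran = 1;

    printf("signals handled: %d, back-to-back deliveries: %d\n",
           (int) handler_calls, (int) back_to_back);
    return 0;
}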

One reason nothing might get logged in some of these cases is that
ResolveRecoveryConflictWithLock() tells
ResolveRecoveryConflictWithVirtualXIDs() to *not* report the waiting:
/*
* Prevent ResolveRecoveryConflictWithVirtualXIDs() from reporting
* "waiting" in PS display by disabling its argument report_waiting
* because the caller, WaitOnLock(), has already reported that.
*/

so ResolveRecoveryConflictWithLock() can end up looping indefinitely without
logging anything.
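
For reference, the call in question looks roughly like this (paraphrased from
standby.c from memory, so details may differ; the final argument is
report_waiting):

    backends = GetLockConflicts(&locktag, AccessExclusiveLock, NULL);
    ResolveRecoveryConflictWithVirtualXIDs(backends,
                                           PROCSIG_RECOVERY_CONFLICT_LOCK,
                                           PG_WAIT_LOCK | locktag.locktag_type,
                                           false);    /* report_waiting */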

Another question I have about ResolveRecoveryConflictWithLock() is whether
it's ok that we don't check for deadlocks around the
ResolveRecoveryConflictWithVirtualXIDs() call? It might be ok, because we'd
only block if there's a recovery conflict, in which case killing the process
ought to succeed?

I think there might also be a problem with the wait loop in ProcSleep() wrt
recovery conflicts: We rely on interrupts being processed to throw recovery
conflict errors, but ProcSleep() is called in a bunch of places with
interrupts held. An Assert(INTERRUPTS_CAN_BE_PROCESSED()) after releasing the
partition lock (see the sketch below the gdb output) triggers in a bunch of
places. It's possible that these aren't problematic cases for recovery
conflicts, because they're all around extension locks:

#2 0x0000562032f1968d in ExceptionalCondition (conditionName=0x56203310623a "INTERRUPTS_CAN_BE_PROCESSED()", errorType=0x562033105f6c "FailedAssertion",
fileName=0x562033105f30 "/home/andres/src/postgresql/src/backend/storage/lmgr/proc.c", lineNumber=1208)
at /home/andres/src/postgresql/src/backend/utils/error/assert.c:69
#3 0x0000562032d50f41 in ProcSleep (locallock=0x562034cafaf0, lockMethodTable=0x562033281740 <default_lockmethod>)
at /home/andres/src/postgresql/src/backend/storage/lmgr/proc.c:1208
#4 0x0000562032d3e2ce in WaitOnLock (locallock=0x562034cafaf0, owner=0x562034d12c58) at /home/andres/src/postgresql/src/backend/storage/lmgr/lock.c:1859
#5 0x0000562032d3cd0a in LockAcquireExtended (locktag=0x7ffc7b4d0810, lockmode=7, sessionLock=false, dontWait=false, reportMemoryError=true, locallockp=0x0)
at /home/andres/src/postgresql/src/backend/storage/lmgr/lock.c:1101
#6 0x0000562032d3c1c4 in LockAcquire (locktag=0x7ffc7b4d0810, lockmode=7, sessionLock=false, dontWait=false)
at /home/andres/src/postgresql/src/backend/storage/lmgr/lock.c:752
#7 0x0000562032d3a696 in LockRelationForExtension (relation=0x7f54646b1dd8, lockmode=7) at /home/andres/src/postgresql/src/backend/storage/lmgr/lmgr.c:439
#8 0x0000562032894276 in _bt_getbuf (rel=0x7f54646b1dd8, blkno=4294967295, access=2) at /home/andres/src/postgresql/src/backend/access/nbtree/nbtpage.c:975
#9 0x000056203288f1cb in _bt_split (rel=0x7f54646b1dd8, itup_key=0x562034ea7428, buf=770, cbuf=0, newitemoff=408, newitemsz=16, newitem=0x562034ea3fc8,
orignewitem=0x0, nposting=0x0, postingoff=0) at /home/andres/src/postgresql/src/backend/access/nbtree/nbtinsert.c:1715
#10 0x000056203288e4bb in _bt_insertonpg (rel=0x7f54646b1dd8, itup_key=0x562034ea7428, buf=770, cbuf=0, stack=0x562034ea1fb8, itup=0x562034ea3fc8, itemsz=16,
newitemoff=408, postingoff=0, split_only_page=false) at /home/andres/src/postgresql/src/backend/access/nbtree/nbtinsert.c:1212
#11 0x000056203288caf9 in _bt_doinsert (rel=0x7f54646b1dd8, itup=0x562034ea3fc8, checkUnique=UNIQUE_CHECK_YES, indexUnchanged=false, heapRel=0x7f546823dde0)
at /home/andres/src/postgresql/src/backend/access/nbtree/nbtinsert.c:258
#12 0x000056203289851f in btinsert (rel=0x7f54646b1dd8, values=0x7ffc7b4d0c50, isnull=0x7ffc7b4d0c30, ht_ctid=0x562034dd083c, heapRel=0x7f546823dde0,
checkUnique=UNIQUE_CHECK_YES, indexUnchanged=false, indexInfo=0x562034ea71c0) at /home/andres/src/postgresql/src/backend/access/nbtree/nbtree.c:200
#13 0x000056203288710b in index_insert (indexRelation=0x7f54646b1dd8, values=0x7ffc7b4d0c50, isnull=0x7ffc7b4d0c30, heap_t_ctid=0x562034dd083c,
heapRelation=0x7f546823dde0, checkUnique=UNIQUE_CHECK_YES, indexUnchanged=false, indexInfo=0x562034ea71c0)
at /home/andres/src/postgresql/src/backend/access/index/indexam.c:193
#14 0x000056203292e9da in CatalogIndexInsert (indstate=0x562034dd02b0, heapTuple=0x562034dd0838)

(gdb) p num_held_lwlocks
$14 = 1
(gdb) p held_lwlocks[0]
$15 = {lock = 0x7f1a0d18d2e4, mode = LW_EXCLUSIVE}
(gdb) p held_lwlocks[0].lock->tranche
$16 = 56
(gdb) p BuiltinTrancheNames[held_lwlocks[0].lock->tranche - NUM_INDIVIDUAL_LWLOCKS]
$17 = 0x558ce5710ede "BufferContent"
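
To be concrete, the assertion I added is roughly the following sketch, placed
right after ProcSleep() releases the lock table's partition lock (the proc.c
line in frame #3 above):

    LWLockRelease(partitionLock);

    /*
     * Recovery conflict errors are thrown from ProcessInterrupts(), so if we
     * can get here with interrupts held, a conflict signal arriving while we
     * sleep on the lock can't actually be acted upon.
     */
    Assert(INTERRUPTS_CAN_BE_PROCESSED());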

Independent of recovery conflicts, isn't it dangerous that we acquire the
relation extension lock with a buffer content lock held? I *guess* it might be
ok because BufferAlloc(P_NEW) only acquires buffer content locks in a
conditional way.
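
What I mean by "conditional" is roughly this bit of BufferAlloc() (paraphrased
from memory, not verbatim), when it has to write out a dirty victim buffer:

    /* we need a share-lock on the victim's contents to write it out */
    if (LWLockConditionalAcquire(BufferDescriptorGetContentLock(buf),
                                 LW_SHARED))
    {
        FlushBuffer(buf, NULL);
        LWLockRelease(BufferDescriptorGetContentLock(buf));
    }
    else
    {
        /* somebody else has it locked; give up and pick another victim */
        UnpinBuffer(buf, true);
        continue;
    }

I.e. it only ever trylocks content locks there, so it at least can't block
behind a backend that's stuck waiting for the extension lock.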

Greetings,

Andres Freund
