Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler?
Date: 2022-06-22 02:33:01
Message-ID: 20220622023301.pwgy2tsxcc2ypzao@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2022-06-21 17:22:05 +1200, Thomas Munro wrote:
> Problem: I saw 031_recovery_conflict.pl time out while waiting for a
> buffer pin conflict, but so far once only, on CI:
>
> https://cirrus-ci.com/task/5956804860444672
>
> timed out waiting for match: (?^:User was holding shared buffer pin
> for too long) at t/031_recovery_conflict.pl line 367.
>
> Hrmph. Still trying to reproduce that, which may be a bug in this
> patch, a bug in the test or a pre-existing problem. Note that
> recovery didn't say something like:
>
> 2022-06-21 17:05:40.931 NZST [57674] LOG: recovery still waiting
> after 11.197 ms: recovery conflict on buffer pin
>
> (That's what I'd expect to see in
> https://api.cirrus-ci.com/v1/artifact/task/5956804860444672/log/src/test/recovery/tmp_check/log/031_recovery_conflict_standby.log
> if the startup process had decided to send the signal).
>
> ... so it seems like the problem in that run is upstream of the interrupt stuff.

Odd. The only theory I have so far is that the manual vacuum on the primary
somehow decided to skip the page, and thus didn't trigger a conflict, because
replay clearly progressed past the records of the VACUUM. Perhaps we should
use VACUUM VERBOSE? Unlike in pg_regress tests, the extra output should be
unproblematic here?
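
To tell the two cases apart in a failed run, one could scan the standby log
for the "recovery still waiting" line quoted above. A rough sketch in Python,
assuming the log is available as text; the function name conflict_waits and
the regular expression are illustrative only and not part of the test suite:

```python
import re

# Matches the startup process's log line quoted above, e.g.
# "LOG: recovery still waiting after 11.197 ms: recovery conflict on buffer pin"
WAITING_RE = re.compile(
    r"LOG:\s+recovery still waiting after [\d.]+ ms: "
    r"recovery conflict on (?P<kind>.+)$"
)


def conflict_waits(log_text):
    """Return the conflict kinds for which recovery logged that it was waiting.

    Hypothetical helper: an empty result suggests the startup process never
    reported waiting on a conflict, i.e. the problem is upstream of the
    interrupt handling.
    """
    kinds = []
    for line in log_text.splitlines():
        match = WAITING_RE.search(line)
        if match:
            kinds.append(match.group("kind").strip())
    return kinds
```

If this returns nothing for the buffer pin step, that would be consistent with
the VACUUM having skipped the page rather than with a lost signal.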

Greetings,

Andres Freund
