From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Race condition in server-crash testing |
Date: | 2022-04-06 00:46:01 |
Message-ID: | 2958325.1649205961@sss.pgh.pa.us |
Lists: | pgsql-hackers |

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2022-04-04 00:50:27 -0400, Tom Lane wrote:
>> It's hard to be totally sure, but I think what happened is that
>> gaur hit the in-hindsight-obvious race condition in this code:
>> we managed to execute a successful iteration of poll_query_until
>> before the postmaster had noticed its dead child and commenced
>> the restart. The test lines after these are not prepared to see
>> failure-to-connect.
>> It's not obvious to me how to remove this race condition.
>> Thoughts?

> Maybe we can use pump_until() with the psql that's not getting killed? With a
> non-matching regex? That'd only return once the backend was killed by
> postmaster, afaics?

Good idea. What I actually did was to borrow the recently-fixed code
in 013_crash_restart.pl that checks for psql's "connection lost"
report.
regards, tom lane
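
For readers following the thread, the ordering described above can be sketched roughly as follows. This is a minimal Python illustration of the "pump until a pattern appears" idea, not the Perl code in 013_crash_restart.pl. The helper names `pump_until` and `make_fake_stderr`, the fake stderr source, and the exact message strings in `CONNECTION_LOST` are assumptions made for this example.

```python
import re
import time
from typing import Callable, Iterable

# Assumed wording of the "connection lost" reports; treat these strings as
# illustrative rather than as the exact regex used by the real test.
CONNECTION_LOST = (
    r"server closed the connection unexpectedly"
    r"|connection to server was lost"
)


def pump_until(read_chunk: Callable[[], str], pattern: str,
               timeout_s: float, poll_interval_s: float = 0.01) -> bool:
    """Accumulate output from read_chunk() until pattern matches.

    Returns True on a match, False if timeout_s elapses first.
    """
    regex = re.compile(pattern, re.MULTILINE)
    deadline = time.monotonic() + timeout_s
    buffer = ""
    while True:
        buffer += read_chunk()
        if regex.search(buffer):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)


def make_fake_stderr(chunks: Iterable[str]) -> Callable[[], str]:
    """Hypothetical stand-in for the surviving psql session's stderr.

    Each call yields the next chunk, then "" once the chunks run out.
    """
    it = iter(chunks)
    return lambda: next(it, "")


# The surviving session only reports the lost connection once the
# postmaster has noticed the dead child and begun tearing backends down.
stderr = make_fake_stderr([
    "",
    "WARNING:  terminating connection because of crash\n",
    "connection to server was lost\n",
])
saw_disconnect = pump_until(stderr, CONNECTION_LOST, timeout_s=1.0)
```

Only once `saw_disconnect` is true would a test start polling for a successful reconnection. Waiting for the disconnect report first means a poll cannot succeed against the old, not-yet-restarted server, which is the race described earlier in the thread.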