Re: windows CI failing PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: windows CI failing PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED
Date: 2023-03-15 08:00:00
Message-ID: 354d6027-e33f-ad66-6c48-27ca8d2458ca@gmail.com
Lists: pgsql-hackers

Hi,
14.03.2023 01:20, Andres Freund wrote:
>> I am yet to construct a reproduction of the case, but it seems to me that
>> the race condition is not impossible here.
> I suspect the issue could be made much more likely by adding a sleep before
> the pg_queue_signal(SIGCHLD) in pgwin32_deadchild_callback().

Thanks for the tip! With pg_usleep(50000) added there, I can reliably reproduce
the issue, within a minute on average, using the 099_check_pids.pl script I
posted before:
...
2023-03-15 07:26:14.301 GMT|[unknown]|[unknown]|3748|64117316.ea4|LOG:  connection received: host=127.0.0.1 port=49902
2023-03-15 07:26:14.302 GMT|postgres|postgres|3748|64117316.ea4|LOG:  connection authorized: user=postgres database=postgres application_name=099_check-pids.pl
2023-03-15 07:26:14.304 GMT|postgres|postgres|3748|64117316.ea4|LOG:  statement: SELECT pg_backend_pid()
2023-03-15 07:26:14.305 GMT|postgres|postgres|3748|64117316.ea4|LOG:  disconnection: session time: 0:00:00.005 user=postgres database=postgres host=127.0.0.1 port=49902
...
2023-03-15 07:26:25.592 GMT|[unknown]|[unknown]|3748|64117321.ea4|LOG:  connection received: host=127.0.0.1 port=50407
TRAP: failed Assert("PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED"), File: "C:\src\postgresql\src\backend\storage\ipc\pmsignal.c", Line: 329, PID: 3748
abort() has been called
2023-03-15 07:26:25.608 GMT|[unknown]|[unknown]|3524|64117321.dc4|LOG:  connection received: host=127.0.0.1 port=50408

The result depends on some OS conditions (it reproduced most readily right
after a VM reboot), but that is enough to test the proposed patch.
And I can confirm that the Assert is no longer observed with the patch applied
(with the sleep added after CloseHandle(childinfo->procHandle)).

Best regards,
Alexander
