Fix a race condition in ConditionVariableTimedSleep()

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Fix a race condition in ConditionVariableTimedSleep()
Date: 2025-05-05 07:52:46
Message-ID: aBhuTqNhMN3prcqe@ip-10-97-1-34.eu-west-3.compute.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi hackers,

While working on wait event related stuff I observed a failed assertion:

"
TRAP: failed Assert("node->next == 0 && node->prev == 0"), File: "../../../../src/include/storage/proclist.h", Line: 91
"

during pg_regress/database.

To reproduce, add an ereport(LOG,..) or CHECK_FOR_INTERRUPTS() or whatever
would trigger ConditionVariableBroadcast() in pgstat_report_wait_end():

And launch:

postgres=# CREATE TABLESPACE regress_tblspace LOCATION '/home/postgres/bdttbs';
CREATE TABLESPACE
postgres=# create database regression_bdt9;
CREATE DATABASE
postgres=# ALTER DATABASE regression_bdt9 SET TABLESPACE regress_tblspace;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

The issue is that in ConditionVariableTimedSleep() during the WaitLatch() we’ll
also trigger ProcessInterrupts() that will call ConditionVariableCancelSleep()
that will:

- remove the process from the wait list
- and also set cv_sleep_target to NULL.

Then in ConditionVariableTimedSleep(), we’ll go through:


/*
* If this process has been taken out of the wait list, then we know
* that it has been signaled by ConditionVariableSignal (or
* ConditionVariableBroadcast), so we should return to the caller. But
* that doesn't guarantee that the exit condition is met, only that we
* ought to check it. So we must put the process back into the wait
* list, to ensure we don't miss any additional wakeup occurring while
* the caller checks its exit condition. We can take ourselves out of
* the wait list only when the caller calls
* ConditionVariableCancelSleep.
*
* If we're still in the wait list, then the latch must have been set
* by something other than ConditionVariableSignal; though we don't
* guarantee not to return spuriously, we'll avoid this obvious case.
*/
SpinLockAcquire(lock: &cv->mutex);
if (!proclist_contains(&cv->wakeup, MyProcNumber, cvWaitLink))
{
done = true;
proclist_push_tail(&cv->wakeup, MyProcNumber, cvWaitLink);
}
SpinLockRelease(&cv->mutex);

Means we re-add the process in the wait list. The issue is that cv_sleep_target
is still NULL so that we’ll exit ConditionVariableTimedSleep() due to:

"
if (cv != cv_sleep_target)
done = true;

/* We were signaled, so return */
if (done)
return false;
"

Later we’ll want to add our process to a wait list, calling ConditionVariablePrepareToSleep()
that will not call ConditionVariableCancelSleep() due to:


if (cv_sleep_target != NULL)
ConditionVariableCancelSleep();

As cv_sleep_target is still NULL.
Finally calling proclist_push_tail() that will trigger the failed assertion.

The proposed fix attached is done in ConditionVariableTimedSleep() as this is the
place that introduces the race condition. It re-assigns cv_sleep_target to cv
and then ensures that cv_sleep_target accurately describes which condition
variable we’re prepared to wait on.

Looking forward to your feedback,

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v1-0001-Fix-a-race-condition-in-ConditionVariableTimedSle.patch text/x-diff 2.0 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mikhail Kharitonov 2025-05-05 08:18:18 [PATCH] Fix replica identity mismatch for partitioned tables with publish_via_partition_root
Previous Message Laurenz Albe 2025-05-05 07:31:00 Re: PATCH: pg_dump to support "on conflict do update"