Re: `pg_ctl init` crashes when run concurrently; semget(2) suspected

From: Gavin Panella <gavinpanella(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: `pg_ctl init` crashes when run concurrently; semget(2) suspected
Date: 2025-08-11 22:45:10
Message-ID: CALL7chntPvKFBN0dE8TF7xOhkBbRGF4R=GTOj7vyZJQZwWGKfw@mail.gmail.com
Lists: pgsql-hackers

With that fix applied to REL_17_5 things are working well. Limiting the
search sounds like an improvement too.

As an experiment I added a log message for when semget() in
InternalIpcSemaphoreCreate returns -1. Running `pg_ctl init` from this
local build concurrently with `pg_ctl init` from PostgreSQL 15 (or any
other version prior to 17), I saw ~8 logged failures when there was
contention. As I increased the concurrency, the maximum number of logged
failures grew to roughly 8 times the concurrency. For me, then, it would
take a concurrency of about 125 (1000 / 8) before `pg_ctl init` could even
begin to exceed the max retries of 1000, and that is the worst case. That
sounds high enough.
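
To make that estimate explicit, here is a minimal sketch of the arithmetic.
It assumes failures scale linearly with concurrency, which is only what I
observed locally; the function name is hypothetical and not part of the
PostgreSQL source.

```c
#include <assert.h>

/*
 * Hypothetical helper, not PostgreSQL code: the smallest concurrency at
 * which the worst-case number of failed semget() calls reaches the retry
 * limit, ASSUMING failures grow linearly with concurrency.
 */
int
min_concurrency_to_hit_limit(int max_retries, int failures_per_process)
{
    assert(max_retries > 0 && failures_per_process > 0);

    /* Ceiling division: round up so the limit is actually reached. */
    return (max_retries + failures_per_process - 1) / failures_per_process;
}
```

Calling `min_concurrency_to_hit_limit(1000, 8)` gives 125, matching the
figure above.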

Then I thought: I'm only seeing the log from one of those instances, yet
they're all going through the same search for free semaphore sets. That's a
few system calls going to waste. Maybe not important in the big picture,
but it gave me an idea: left-shift nextSemaKey in PGReserveSemaphores,
i.e. `nextSemaKey = statbuf.st_ino << 4`, to give each pg_ctl process a few
guaranteed uncontested keys (uncontested between themselves, at least). In
a small test this eliminated contention for semaphore sets due to
concurrency. It is more of an optimisation than a bug fix, though.
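
A rough, standalone illustration of why the shift helps, not the actual
sysv_sema.c code: it assumes the key search probes consecutive keys upward
from the starting key, and it models the key as a 32-bit value. The helper
names and the fixed-width types are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Shift taken from the "<< 4" idea above: 1 << 4 = 16 keys per inode. */
#define SEMA_KEY_SHIFT 4

/*
 * Hypothetical stand-in for the starting key derived from the data
 * directory's inode. ASSUMPTION: the key is 32 bits wide, so the upper
 * bits of a 64-bit inode number are discarded by the cast.
 */
uint32_t
first_sema_key(uint64_t data_dir_inode)
{
    return (uint32_t) (data_dir_inode << SEMA_KEY_SHIFT);
}

/*
 * Returns 1 when the 16-key windows starting at each inode's first key do
 * not overlap, i.e. neither process probes into the other's window while
 * it stays within its own 16 keys.
 */
int
key_windows_disjoint(uint64_t inode_a, uint64_t inode_b)
{
    uint32_t a = first_sema_key(inode_a);
    uint32_t b = first_sema_key(inode_b);
    uint32_t span = UINT32_C(1) << SEMA_KEY_SHIFT;
    uint32_t dist = (a > b) ? a - b : b - a;

    return dist >= span;
}
```

With adjacent inodes 1000 and 1001 the starting keys become 16000 and
16016, so each process has 16 keys to itself before it can reach its
neighbour's range. Inodes that differ only in the bits lost to truncation
would still collide, so this reduces contention rather than ruling it out.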

Gavin
