Re: pg_ctl start may return 0 even if the postmaster has been already started on Windows

From: Bryan Green <dbryan(dot)green(at)gmail(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pg_ctl start may return 0 even if the postmaster has been already started on Windows
Date: 2025-10-24 04:02:50
Message-ID: e4025275-0f97-4a3e-b107-a85e60ccf0f7@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 3/26/2025 3:18 AM, vignesh C wrote:
> On Sat, 20 Jul 2024 at 00:03, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>
>> On Tue, Jul 16, 2024 at 8:04 AM Yasir <yasir(dot)hussain(dot)shah(at)gmail(dot)com> wrote:
>>> Please ignore the above 4 lines in my review. See my comments in blue.
>>
>> OK, so I think it's unclear what the next steps are for this patch.
>>
>> 1. On June 3rd, Michael Paquier said that Tom Lane proposed that,
>> after doing what the patch currently does, we could simplify some
>> other stuff. The details are unclear, and Tom hasn't commented.
>>
>> 2. On June 29th, Noah Misch proposed a platform-independent way of
>> solving the problem.
>>
>> 3. On July 12th, Sutou Kouhei proposed using CreateProcess() to start
>> the postmaster instead of cmd.exe.
>>
>> 4. On July 16th, Yasir Shah said that he tested the patch and found
>> that the problem only exists in v17, not any prior release, which is
>> contrary to my understanding of the situation. He also proposed a
>> minor tweak to the patch itself.
>>
>> So, as I see it, we have three possible ways forward here. First, we
>> could stick with the current patch, possibly with further work as per
>> [1] or adjustments as per [4]. Second, we could abandon the current
>> approach and adopt Noah's proposal in [2]. Third, we could possibly
>> abandon the current approach and adopt Sutou's proposal in [3]. I say
>> "possibly" because I can't personally assess whether this approach is
>> feasible.
>
> Thank you very much, Robert, for summarizing this. If anyone has
> suggestions on which approach might work best, please share them to
> help move this discussion forward.
>
> Regards,
> Vignesh
>
>
Hello everyone,

I've spent the last few days implementing Sutou-san's CreateProcess()
proposal as a proof-of-concept. With all three approaches now available
for comparison, I wanted to share my assessment and recommendation.

The core issue, as we've established, is that cmd.exe intermediation
prevents pg_ctl from getting the real postgres.exe PID, creating race
conditions where we can't distinguish a newly-started postmaster from a
pre-existing one. This causes pg_ctl start to incorrectly report success
when attempting to start an already-running cluster.

Horiguchi-san's process tree walking patch works and is ready to ship.
It has real merit as a minimal-change solution. However, it keeps
cmd.exe and adds process enumeration complexity to work around that
decision. We're fixing symptoms rather than the root cause.

Noah's token proposal is architecturally elegant and
platform-independent. My concern is that it solves a Windows-specific
problem by adding complexity to all platforms. Making other platforms
adopt token-passing infrastructure for Windows's problems feels wrong.
It also requires postmaster changes for what is fundamentally a pg_ctl
issue, raising compatibility questions.

The CreateProcess() approach eliminates cmd.exe entirely and gives us
the actual postgres.exe PID directly. I've implemented this in the
attached patch (about 250 lines of Windows-specific code). The
implementation uses CreateProcessAsUser() with a restricted security
token to handle the case where pg_ctl runs with elevated privileges—this
drops Administrator group membership and dangerous privileges before
launching the postmaster, matching the behavior of the existing
CreateRestrictedProcess() function.

For I/O redirection, we use Windows's native handle-based APIs. When a
log file is specified, we open it with CreateFile() and pass it through
STARTUPINFO. When there's no log file (interactive use and postgres -C
queries), we inherit standard handles. One detail worth noting: we wait
2 seconds for no-log-file launches to distinguish postgres -C (which
exits immediately) from actual server startup. This is long enough to
catch quick exits even on loaded CI systems without impacting test suite
performance.

The implementation passes all regression tests on Windows, including CI
environments with elevated privileges. It handles all the problem
scenarios correctly: normal startup, detecting already-running clusters,
postgres -C queries, pg_upgrade with multiple instances, and concurrent
start attempts.

I prefer the CreateProcess() approach. It fixes the root cause, not the
symptoms. We get the real PID immediately with no process tree walking
or PID verification complexity. It's a Windows-specific solution to a
Windows-specific problem.

If the CreateProcess() approach is deemed unacceptable, I would support
Horiguchi-san's process tree walking patch as a pragmatic fallback. It
fixes the bug and is tested. However, I believe we'd be choosing a
workaround over a proper fix and adding maintenance burden we don't need.

I'm opposed to the token approach for the reasons stated above—it's
architecturally appealing but touches all platforms for a Windows-only
problem.

With this implementation complete, we now have all three options on the
table for evaluation. I believe the CreateProcess() approach is
technically sound and the right path forward, but I'm open to discussion
and happy to address any concerns about the implementation.

Regardless of which way we choose to go, I'll happily help review and
test the solution.

Best regards,
BG

Attachment Content-Type Size
0001-Fix-pg_ctl-on-Windows-to-reliably-detect-already-run.patch text/plain 21.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Matt Smith (matts3) 2025-10-24 04:18:42 Re: Meson install warnings when running postgres build from a sandbox
Previous Message Zhijie Hou (Fujitsu) 2025-10-24 03:25:49 RE: Logical Replication of sequences