| From: | Bryan Green <dbryan(dot)green(at)gmail(dot)com> |
|---|---|
| To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | [PATCH] O_CLOEXEC not honored on Windows - handle inheritance chain |
| Date: | 2025-10-31 17:16:04 |
| Message-ID: | e2b16375-7430-4053-bda3-5d2194ff1880@gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hello,
While reviewing the pg_ctl CreateProcess patch [1], I started looking
into handle inheritance on Windows and found something that puzzles me.
I think there's an issue with O_CLOEXEC, but wanted to walk through my
reasoning to make sure I'm not missing something obvious.
The issue appears to be that O_CLOEXEC doesn't actually do anything on
Windows. When PostgreSQL opens a WAL file in xlog.c, it specifies
O_CLOEXEC in the OpenTransientFile() call, expecting the file handle to
be non-inheritable. However, O_CLOEXEC is defined as 0 in win32_port.h,
and our pgwin32_open() function in src/port/open.c unconditionally sets
sa.bInheritHandle = TRUE at line 80. So the flag is simply ignored, and
all file handles are created inheritable.
Now, for a handle to actually be inherited by a child process on
Windows, two conditions must both be true. First, the handle itself must
have been created with bInheritHandle = TRUE (which we do for
everything). Second, the parent must call CreateProcess with
bInheritHandles = TRUE. So the question becomes: does that second
condition ever happen?
It does. When archive_command runs, PostgreSQL calls pgwin32_system(),
which wraps the command in quotes and passes it to the Microsoft C
runtime's system() function. That function needs to make stdio work for
the child process, so it has no choice but to call CreateProcess with
bInheritHandles = TRUE. This means cmd.exe inherits all our file
handles, including any open database or WAL files, and cmd.exe passes
them along to copy.exe or whatever command is being run.
I wrote a test program that links against libpgport.a to verify this
behavior. It opens files with and without O_CLOEXEC using the actual
pgwin32_open() function, then spawns a child process with
bInheritHandles = TRUE (mimicking what system() does) and tries to
access the handles from the child. Both files were accessible from the
child process regardless of whether O_CLOEXEC was specified. The flag
has no effect.
Commit 1da569ca1f (March 2023) added O_CLOEXEC to many call sites
throughout the backend with a comment saying "Our open() replacement
does not create inheritable handles, so it is safe to ignore
O_CLOEXEC." But that doesn't appear to match what the code actually
does. I'm wondering if I've misunderstood something about how handle
inheritance works on Windows, or if the comment was based on a
misunderstanding of the code path.
The practical impact of this seems low. Child processes spawned by
archive_command or COPY PROGRAM operate on file paths passed as
arguments, not on inherited file descriptors, so they don't actually
make use of the handles even though they have them. Even if a child
wanted to use an inherited handle, it would need to somehow learn the
numeric handle value, which isn't passed through our IPC mechanisms.
And Windows users probably employ archive_command less frequently than
Unix users anyway.
Nonetheless, if this is really a bug, it's a correctness issue. It
violates the documented semantics of O_CLOEXEC, contradicts what our
own comments claim, and means PostgreSQL behaves differently on Windows
than on Unix. It also creates unnecessary handle leaks where files
can't be deleted or renamed while child processes are running. While
reviewing my pg_ctl patch, I realized it would make handle inheritance
more explicit and direct, which made me want to understand whether
O_CLOEXEC actually works.
The fix would be straightforward if this is indeed wrong. Define
O_CLOEXEC to a non-zero value like 0x80000 (in the private use range
for open() flags), and then honor it in pgwin32_open() by setting
sa.bInheritHandle based on whether the flag is present:
sa.bInheritHandle = (fileFlags & O_CLOEXEC) ? FALSE : TRUE;
We also have to add the O_CLOEXEC to the assertion in open.c that
validates that fileFlags only contains known flags.
I've tested this change locally and it works as expected. Files opened
with O_CLOEXEC are not accessible from child processes, while files
opened without it are accessible.
So my questions are: Am I correct that both conditions for handle
inheritance are met, meaning handles really are being inherited by
archive_command children? Is there something in Windows that prevents
inheritance that I don't know about? If this is a real bug, would it
make sense to backpatch to v16 where O_CLOEXEC was added? I'm happy to
provide my test code or do additional testing if that would help.
For reference, the Microsoft documentation on handle inheritance is at:
https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-createprocessa
And I confirmed through research that UCRT's system() does use
bInheritHandles = TRUE, which makes sense since it needs stdio to work.
Bryan Green
| Attachment | Content-Type | Size |
|---|---|---|
| 0001-Fix-O_CLOEXEC-flag-handling-in-Windows-port.patch | text/plain | 3.8 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Heikki Linnakangas | 2025-10-31 17:17:35 | Re: [PATCH] Speed up of vac_update_datfrozenxid. |
| Previous Message | Heikki Linnakangas | 2025-10-31 17:11:01 | Re: Calling PGReserveSemaphores() from CreateOrAttachShmemStructs |