| From: | Bryan Green <dbryan(dot)green(at)gmail(dot)com> |
|---|---|
| To: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
| Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: [PATCH] O_CLOEXEC not honored on Windows - handle inheritance chain |
| Date: | 2026-06-20 17:48:05 |
| Message-ID: | 9425ab17-dc61-4809-86e5-cb0640ba49d3@gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On 6/20/2026 2:00 AM, Alexander Lakhin wrote:
> Hello Bryan,
>
> 13.12.2025 04:44, Tom Lane wrote:
>> Thomas Munro<thomas(dot)munro(at)gmail(dot)com> writes:
>>> Hearing nothing, pushed.
>> fairywren is unimpressed:
>
> Now that fairywren is pretty satisfied with the test, it has produced an
> interesting failure [1]:
> 208/222 postgresql:test_cloexec /
> test_cloexec/001_cloexec ERROR 4.09s exit status 1
>
> regress_log_001_cloexec
> [14:36:36.820](1.047s) 1..1
> [14:36:36.821](0.001s) # Using test program: C:\\tools\\xmsys64\\home\
> \pgrunner\\bf\\root\\REL_16_STABLE\\pgsql.build\\tmp_install\\tools\
> \xmsys64\\home\\pgrunner\\bf\\root\\REL_16_STABLE\\inst\\bin\
> \test_cloexec.exe
> [14:36:37.445](0.624s) # Test program output:
> [14:36:37.446](0.001s) # Child: Received HANDLE1=00000000000000B0
> (should fail - O_CLOEXEC)
> # Child: Received HANDLE2=00000000000000C0 (should work - no O_CLOEXEC)
> # Child: Successfully wrote to HANDLE1
> # Child: Successfully wrote to HANDLE2
> # Child: HANDLE1 (O_CLOEXEC): ACCESSIBLE (BAD!)
> # Child: HANDLE2 (no O_CLOEXEC): ACCESSIBLE (GOOD!)
> # Child: Test FAILED - O_CLOEXEC not working correctly
> # Parent: Opening test files...
> # Parent: fd1=3 (O_CLOEXEC) -> HANDLE=00000000000000B0
> # Parent: fd2=4 (no O_CLOEXEC) -> HANDLE=00000000000000C0
> # Parent: Spawning child process...
> # Parent: Command line: "C:\\tools\\xmsys64\\home\\pgrunner\\bf\\root\
> \REL_16_STABLE\\pgsql.build\\tmp_install\\tools\\xmsys64\\home\
> \pgrunner\\bf\\root\\REL_16_STABLE\\inst\\bin\\test_cloexec.exe"
> 00000000000000B0 00000000000000C0
> # Parent: Waiting for child process...
> # Parent: Child exit code: 1
> # Parent: FAILURE - O_CLOEXEC not working correctly
> [14:36:37.447](0.002s) not ok 1 - O_CLOEXEC prevents handle inheritance
> [14:36:37.448](0.001s) # Failed test 'O_CLOEXEC prevents handle
> inheritance'
> # at C:/tools/xmsys64/home/pgrunner/bf/root/REL_16_STABLE/pgsql/src/
> test/modules/test_cloexec/t/001_cloexec.pl line 57.
> [14:36:37.449](0.001s) # Looks like you failed 1 test of 1.
>
> I've managed to reproduce it locally with:
> --- a/src/test/modules/test_cloexec/t/001_cloexec.pl
> +++ b/src/test/modules/test_cloexec/t/001_cloexec.pl
> @@ -19,3 +19,3 @@ if (!$PostgreSQL::Test::Utils::windows_os)
>
> -plan tests => 1;
> +plan tests => 1000;
>
> @@ -44,2 +44,4 @@ note("Using test program: $test_prog");
>
> +for (my $i = 1; $i <= 1000; $i++)
> +{
> my ($stdout, $stderr);
> @@ -58,3 +60,3 @@ ok($result && $stdout =~ /SUCCESS.*O_CLOEXEC behavior
> verified/s,
> "O_CLOEXEC prevents handle inheritance");
> -
> +}
> done_testing();
>
> and with the test cloned:
> perl -i.bak -ne "print unless /test_cloexec_/" "src/test/modules/
> meson.build" && python3 -c "for i in range(1,50+1):import os;import
> shutil;sd=\"src/test/modules/test_cloexec/\";td=f\"src/test/modules/
> test_cloexec_{i}\";shutil.rmtree(td,ignore_errors=1);shutil.copytree(sd,
> td);assert(os.system(f'perl -i.bak -pe \"s#(subdir.
> \'test_cloexec\'.)#subdir(\'test_cloexec_{i}\'){chr(92)}n{chr(92)}1#\"
> \"src/test/modules/meson.build\"')==0);assert(os.system(f'perl -i.bak -
> pe \"s#: \'test_cloexec\',#: \'test_cloexec_{i}\',#\" \"{td}/
> meson.build\"')==0)"
>
> meson test test_cloexec_*/001_cloexec --num-processes 50
> failed for me as below:
> Ok: 19
> Fail: 31
>
> # grep 'not ok' meson-logs/testlog.txt
> not ok 541 - O_CLOEXEC prevents handle inheritance
> not ok 303 - O_CLOEXEC prevents handle inheritance
> not ok 798 - O_CLOEXEC prevents handle inheritance
> not ok 901 - O_CLOEXEC prevents handle inheritance
> not ok 634 - O_CLOEXEC prevents handle inheritance
> not ok 640 - O_CLOEXEC prevents handle inheritance
> not ok 770 - O_CLOEXEC prevents handle inheritance
> not ok 938 - O_CLOEXEC prevents handle inheritance
> not ok 420 - O_CLOEXEC prevents handle inheritance
> not ok 674 - O_CLOEXEC prevents handle inheritance
> not ok 114 - O_CLOEXEC prevents handle inheritance
> not ok 649 - O_CLOEXEC prevents handle inheritance
> not ok 51 - O_CLOEXEC prevents handle inheritance
> not ok 692 - O_CLOEXEC prevents handle inheritance
> not ok 874 - O_CLOEXEC prevents handle inheritance
> not ok 955 - O_CLOEXEC prevents handle inheritance
> not ok 690 - O_CLOEXEC prevents handle inheritance
> not ok 33 - O_CLOEXEC prevents handle inheritance
> not ok 116 - O_CLOEXEC prevents handle inheritance
> not ok 398 - O_CLOEXEC prevents handle inheritance
> not ok 773 - O_CLOEXEC prevents handle inheritance
> not ok 2 - O_CLOEXEC prevents handle inheritance
> not ok 253 - O_CLOEXEC prevents handle inheritance
> not ok 10 - O_CLOEXEC prevents handle inheritance
> not ok 902 - O_CLOEXEC prevents handle inheritance
> not ok 848 - O_CLOEXEC prevents handle inheritance
> not ok 832 - O_CLOEXEC prevents handle inheritance
> not ok 38 - O_CLOEXEC prevents handle inheritance
> not ok 986 - O_CLOEXEC prevents handle inheritance
> not ok 688 - O_CLOEXEC prevents handle inheritance
> not ok 872 - O_CLOEXEC prevents handle inheritance
> not ok 582 - O_CLOEXEC prevents handle inheritance
> not ok 816 - O_CLOEXEC prevents handle inheritance
> not ok 277 - O_CLOEXEC prevents handle inheritance
> not ok 570 - O_CLOEXEC prevents handle inheritance
> not ok 691 - O_CLOEXEC prevents handle inheritance
> not ok 871 - O_CLOEXEC prevents handle inheritance
> not ok 770 - O_CLOEXEC prevents handle inheritance
> not ok 108 - O_CLOEXEC prevents handle inheritance
> not ok 610 - O_CLOEXEC prevents handle inheritance
> not ok 866 - O_CLOEXEC prevents handle inheritance
> not ok 909 - O_CLOEXEC prevents handle inheritance
> not ok 276 - O_CLOEXEC prevents handle inheritance
> not ok 190 - O_CLOEXEC prevents handle inheritance
>
> It looks like HANDLE1 passed from parent to child can accidentally point
> to a child slot occupied by some other writable object, not inherited from
> parent.
>
> [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?
> nm=fairywren&dt=2026-06-19%2013%3A28%3A44
>
> Best regards,
> Alexander
Yeah, the test script has problems. Passing raw numeric handle values
is problematic. When using CreateProcess with bInheritHandles=TRUE, the
kernel duplicates "inheritable" handles into the child at the same
numeric value. So, HANDLE2 works reliably since 0xc0 is he same file
object in both processes.
The non-inherited handle is simply unmanaged in the child. O_CLOEXEC
means that 0xB0 will not be duplicated into the child process. The
problem is that we have no promise that some other kernel object handle
is not at that offset in the child. The child's own startup populates
its handle table and windows reuses low handle values aggressively.
Assuming, the handle was marked as not inheritable in the parent but the
child's handle table has an entry at that same offset for another kernel
object: try_write_to_handle() does a writefile against another kernel
object and succeeds. This is a false failure. This test can't tell the
difference between the actual inherited file and another kernel object
that just happened to have this offset in the handle table assigned to it.
Running in parallel will hit this problem often.
Also, by writing into whatever random kernel object that is referenced
by this offset in the handle table in the child could cause real side
effects when it misfires.
We have a couple of options: 1) use GetHandleInformation for each
handle and test the HANDLE_FLAG_INHERIT bit. This would be a completely
deterministic test and no spawn shenanigans. 2) If we want to keep an
end to end spawn test-- we pass the file paths to the child as well as
the expected handles and use GetFinalPathNameByHandle() to test that the
passed in handles resolve to the correct paths in the child.
The end to end test just seems to be testing that CreateProcess honors
the O_CLOEXEC, which is testing the OS behavior. The
GetHandleInformation(handle1/handle2) check for HANDLE_FLAG_INHERIT is
simpler and would show that the handle has been marked as inheritable or
not. We should trust the kernel to do what it says it does and just
test that our code path for O_CLOEXEC actually creates the handle as
either inheritable or not. This is what GetHandleInformation can tell
us and it removes the need for the spawn.
I am busy with an unfortunate family emergency or I would just code this
as a patch now given agreement on path, but I can get to this by the end
of the week if no one else cares to implement.
--
Bryan Green
EDB: https://www.enterprisedb.com
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Zakariyah Ali | 2026-06-20 17:49:06 | [PATCH v2] Standardize log polling using wait_for_log() across recovery tests |
| Previous Message | ZizhuanLiu X-MAN | 2026-06-20 15:48:55 | Re: BUG with accessing to temporary tables of other sessions still exists |