Re: GNU/Hurd portability patches

From: Michael Banck <mbanck(at)gmx(dot)net>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Alexander Lakhin <exclusion(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: GNU/Hurd portability patches
Date: 2025-09-24 10:45:59
Message-ID: 68d3cbe8.5d0a0220.65f10.da5f@mx.google.com
Lists: pgsql-hackers

On Wed, Sep 24, 2025 at 09:41:19AM +0200, Michael Banck wrote:
> On Wed, Sep 24, 2025 at 08:31:27AM +0900, Michael Paquier wrote:
> > So yes, this random factor would be annoying in the buildfarm.
>
> How much timer resolution do we require from the system? GNU Mach seems
> to (at least try to) guarantee that the timer won't go backwards, but it
> does not guarantee (currently) that two consecutive clock_gettime()
> calls will return something different in all cases.

This is the pg_test_timing output on my hurd-i386 VM with
pg_test_timing from HEAD:

Average loop time including overhead: 13866,64 ns
Histogram of timing durations:
   <= ns   % of total  running %      count
       0       0,0510     0,0510        122
       1       0,0000     0,0510          0
       3       0,0000     0,0510          0
       7       0,0000     0,0510          0
      15       0,0000     0,0510          0
      31       0,0000     0,0510          0
      63       0,0000     0,0510          0
     127       0,0000     0,0510          0
     255       0,0000     0,0510          0
     511       0,0000     0,0510          0
    1023       0,0004     0,0514          1
    2047       0,0000     0,0514          0
    4095      98,9320    98,9834     236681
    8191       0,8845    99,8679       2116
   16383       0,0393    99,9072         94
   32767       0,0343    99,9415         82
[...]

Observed timing durations up to 99,9900%:
      ns   % of total  running %      count
       0       0,0510     0,0510        122
     729       0,0004     0,0514          1
    3519       0,0004     0,0518          1
    3630       0,0130     0,0648         31
    3640       0,1651     0,2299        395
    3650       0,7449     0,9748       1782
    3660       2,3395     3,3143       5597

Clearly those aren't very precise (running Debian 13 GNU/Linux on the
same host in the same qemu/kvm fashion, I get an average loop time
including overhead of around 30ns). I had assumed that the 122 0ns
entries were the problem; however, Hannu reported back in 2024 that he
saw something similar on his MacBook Air M1:
https://www.postgresql.org/message-id/CAMT0RQSbzeJN+nPo_QXib-P62rgez=dJxoaTURcN1FYPoLpQPg@mail.gmail.com

|Per loop time including overhead: 21.54 ns
|Histogram of timing durations:
|   <= ns   % of total  running %      count
|       0      49.1655    49.1655   68481688
|       1       0.0000    49.1655          0
|       3       0.0000    49.1655          0
|       7       0.0000    49.1655          0
|      15       0.0000    49.1655          0
|      31       0.0000    49.1655          0
|      63      50.6890    99.8545   70603742
|     127       0.1432    99.9976     199411
|     255       0.0015    99.9991       2065

I wonder what is going on here. Was that a fluke, or is it not related
to the stats isolation test failure after all? Has anybody else tried
the updated pg_test_timing on Apple hardware, and could they possibly
run the tt.c test case from Alexander?

btw, the stats test failed in a similar way on hamerkop (Windows Server
2016) once, 35 days ago:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamerkop&dt=2025-08-19%2013%3A56%3A17

Michael
