| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Andrew Dunstan <andrew(at)dunslane(dot)net>, michael(dot)banck(at)credativ(dot)de |
| Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Maybe BF "timedout" failures are the client script's fault? |
| Date: | 2026-01-09 20:41:03 |
| Message-ID: | 2423164.1767991263@sss.pgh.pa.us |
| Lists: | pgsql-hackers |

We've been assuming that all the "timedout" failures on BF member
fruitcrow were due to some wonkiness in the GNU/Hurd platform.
I got suspicious about that though after noticing that there are
a small number of such failures on other animals, eg [1][2][3].
In each case, the failure message claims it waited a good long
time, which is at variance with the actually observed runtime.
For instance [1] says "timed out after 14400 secs", but the
actual total test runtime is only 01:24:28 according to the
summary at the top of the page.

Looking into the buildfarm client, I realized that it's assuming that
"sleep($wait_time)" is sufficient to wait for $wait_time seconds.
However, the Perl docs point out that sleep() can be interrupted by a
signal. So now I'm suspicious that many of these failures are caused
by a stray signal waking up the wait_timeout thread prematurely.
GNU/Hurd might just be more prone to that than other platforms.

I propose the attached patch to the BF client to try to make this
more robust.
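
The attached patch is not reproduced in this archive text, so the following is only a minimal sketch of the general technique, not the patch itself. It assumes the fix amounts to re-checking the clock and sleeping again whenever sleep() returns before the deadline. The helper name `sleep_fully` is hypothetical, and using Time::HiRes for sub-second precision is an illustrative choice rather than something the buildfarm client is known to do.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper, not taken from the buildfarm client or the patch:
# wait until at least $wait_time seconds of wall-clock time have passed,
# going back to sleep if a signal wakes us up early.
sub sleep_fully
{
	my ($wait_time) = @_;
	my $deadline = time() + $wait_time;

	# sleep() may return early when a signal is delivered, so loop
	# until the deadline has really been reached.
	while ((my $remaining = $deadline - time()) > 0)
	{
		sleep($remaining);
	}
	return;
}
```

A timeout watcher could then call `sleep_fully($wait_time)` where it previously called `sleep($wait_time)`. This sketch relies on wall-clock time, so a large clock adjustment during the wait would still distort the interval.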

regards, tom lane

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=ovenbird&dt=2025-11-14%2009%3A21%3A05
[2] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=conchuela&dt=2025-10-17%2018%3A32%3A07
[3] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=opaleye&dt=2026-01-08%2023%3A07%3A37

| Attachment | Content-Type | Size |
|---|---|---|
| make-bf-timeout-more-robust.patch | text/x-diff | 496 bytes |