IPC::Run::time[r|out] vs our TAP tests

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: IPC::Run::time[r|out] vs our TAP tests
Date: 2024-04-04 21:24:05
Message-ID: 1100715.1712265845@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

TIL that IPC::Run::timer is not the same as IPC::Run::timeout.
With a timer object you have to check $timer->is_expired to see
if the timeout has elapsed, but with a timeout object you don't
because it will throw a Perl exception upon timing out, probably
killing your test program.

It appears that a good chunk of our TAP codebase has not read this
memo, because I see plenty of places that are checking is_expired
in the naive belief that they'll still have control after a timeout
has fired.

The particular thing that started me down this road was wondering
why we are getting no useful failure details from buildfarm member
tanager's struggles with the tab-completion test case added by commit
927332b95 [1]. Apparently it's not seeing a match to what it expects
so it eventually times out, but all we get in the log is

[03:03:42.595](0.002s) ok 82 - complete an interpolated psql variable name
[03:03:42.597](0.002s) ok 83 - \\r works
IPC::Run: timeout on timer #1 at /usr/share/perl5/IPC/Run.pm line 2944.
# Postmaster PID for node "main" is 17308
### Stopping node "main" using mode immediate

We would have learned something useful if control had returned to
pump_until, or even better 010_tab_completion.pl's check_completion,
but it doesn't.

A minimum fix that seems to make this work better is as attached,
but I feel like somebody ought to examine all the IPC::Run::timer
and IPC::Run::timeout calls and see which ones are mistaken.
It's a little scary to convert a timeout to a timer because of
the hazard that someplace that would need to be checking for
is_expired isn't.

Also, the debug printout code at the bottom of check_completion
is quite useless, because control can never reach it since
BackgroundPsql::query_until will "die" on failure. I think that
that code worked when written, and I'm suspicious that 664d75753
broke it, but I've not dug deeply into the history.

regards, tom lane

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tanager&dt=2024-04-04%2016%3A56%3A14

Attachment Content-Type Size
use-timer-not-timeout-in-BackgroundPsql.patch text/x-diff 1.0 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-04-04 21:25:47 Re: Security lessons from liblzma
Previous Message Kirill Reshke 2024-04-04 21:08:37 Re: CSN snapshots in hot standby