| From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Non-robust plpgsql_trap test |
| Date: | 2026-04-21 14:20:22 |
| Message-ID: | 85af1521-0909-42ea-a0f9-f755919c6cbe@dunslane.net |
| Lists: | pgsql-hackers |
On 2026-04-21 Tu 9:54 AM, Tom Lane wrote:
> I've noticed a few buildfarm failures similar to [1]:
>
> # diff -U3 /repos/client-code-REL_19_1/HEAD/pgsql.build/src/pl/plpgsql/src/expected/plpgsql_trap.out /repos/client-code-REL_19_1/HEAD/pgsql.build/src/pl/plpgsql/src/results/plpgsql_trap.out
> # --- /repos/client-code-REL_19_1/HEAD/pgsql.build/src/pl/plpgsql/src/expected/plpgsql_trap.out 2026-04-21 04:22:01.030204342 -0300
> # +++ /repos/client-code-REL_19_1/HEAD/pgsql.build/src/pl/plpgsql/src/results/plpgsql_trap.out 2026-04-21 04:29:54.795187855 -0300
> # @@ -155,7 +155,7 @@
> # begin;
> # set statement_timeout to 1000;
> # select trap_timeout();
> # -NOTICE: nyeah nyeah, can't stop me
> # +NOTICE: caught others?
> # ERROR: end of function
> # CONTEXT: PL/pgSQL function trap_timeout() line 15 at RAISE
> # rollback;
> not ok 11 - plpgsql_trap 502 ms
>
> which is coming from unexpected behavior of this bit of plpgsql
> code:
>
> begin
>   -- we assume this will take longer than 1 second:
>   select count(*) into x from generate_series(1, 1_000_000_000_000);
> exception
>   when others then
>     raise notice 'caught others?';
>   when query_canceled then
>     raise notice 'nyeah nyeah, can''t stop me';
> end;
>
> The light bulb went on when I noticed a nearby failure from the same
> machine that was clearly traceable to out-of-disk-space. What
> happened here, I have no doubt, was that the "from generate_series"
> bit tried to make a large temporary file, ran out of space, and threw
> an appropriate error, causing us to take the "wrong" exception
> handler.
>
> Proposal:
>
> 1. Replace that query with something not so resource-intensive.
> I'm not really sure why we didn't just use "perform pg_sleep(10)".
> Maybe it didn't exist or didn't reliably wait 10 seconds at the
> time, but it does now.
>
> 2. Adjust the "when others" handler to report the actual error,
> to make this sort of thing easier to debug next time.
>
> regards, tom lane
>
> [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=caiman&dt=2026-04-21%2007%3A21%3A57
Sounds good.
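For concreteness, a rough sketch of what the two proposed changes might look like. This is only an illustration, not a patch: it uses a standalone DO block instead of the real trap_timeout() function, and the wording of the "others" notice is an assumption. SQLSTATE and SQLERRM are the standard PL/pgSQL handler variables.

```sql
begin;
set local statement_timeout to 1000;

do $$
begin
  -- pg_sleep(10) outlasts the 1-second timeout without needing
  -- temp-file space, unlike the generate_series query:
  perform pg_sleep(10);
exception
  when others then
    -- hypothetical wording: report what was actually caught
    raise notice 'caught others? SQLSTATE %: %', sqlstate, sqlerrm;
  when query_canceled then
    raise notice 'nyeah nyeah, can''t stop me';
end
$$;

rollback;
```

Since "others" does not match query_canceled in PL/pgSQL, the timeout should still reach the second handler, and anything else (such as an out-of-disk-space error) would now identify itself in the notice.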
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com