Re: pg_plan_advice

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alexander Lakhin <exclusion(at)gmail(dot)com>, Lukas Fittl <lukas(at)fittl(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_plan_advice
Date: 2026-04-02 23:43:33
Message-ID: 3683430.1775173413@sss.pgh.pa.us
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

My animal sifaka just showed an all-new type of test_plan_advice
failure [1]:

diff -U3 /Users/buildfarm/bf-data/HEAD/pgsql.build/src/test/regress/expected/limit.out /Users/buildfarm/bf-data/HEAD/pgsql.build/src/test/modules/test_plan_advice/tmp_check/results/limit.out
--- /Users/buildfarm/bf-data/HEAD/pgsql.build/src/test/regress/expected/limit.out 2026-04-02 12:35:13
+++ /Users/buildfarm/bf-data/HEAD/pgsql.build/src/test/modules/test_plan_advice/tmp_check/results/limit.out 2026-04-02 12:49:59
@@ -5,6 +5,8 @@
SELECT ''::text AS two, unique1, unique2, stringu1
FROM onek WHERE unique1 > 50
ORDER BY unique1 LIMIT 2;
+WARNING: supplied plan advice was not enforced
+DETAIL: advice INDEX_SCAN(onek public.onek_unique1) feedback is "matched, inapplicable, failed"
two | unique1 | unique2 | stringu1
-----+---------+---------+----------
| 51 | 76 | ZBAAAA
=== EOF ===
[12:50:02.062](11.620s) not ok 1 - regression tests pass

This is unlike the other cases we've been looking at: no sub-selects,
no GEQO, not even any joins. There is pretty much only one plausible
plan for that query, so how did it fail?

After looking around, I think the likely explanation is that the
concurrently-run alter_table.sql test feels free to mess with the set
of indexes on onek. It doesn't drop onek_unique1, but it does
momentarily rename it:

ALTER INDEX onek_unique1 RENAME TO attmp_onek_unique1;
ALTER INDEX attmp_onek_unique1 RENAME TO onek_unique1;

I've not looked closely at pg_plan_advice, but if it matches indexes
by name then it seems there's a window here where the advice would
fail to apply. Also, further down we find

ALTER TABLE onek ADD CONSTRAINT onek_unique1_constraint UNIQUE (unique1);
ALTER INDEX onek_unique1_constraint RENAME TO onek_unique1_constraint_foo;
ALTER TABLE onek DROP CONSTRAINT onek_unique1_constraint_foo;

which means there's a window there where there are two plausible
indexes to choose. Will test_plan_advice cope if the transient one
is chosen?

We could imagine dodging this problem either by having alter_table.sql
test some purpose-built table instead of a shared one, or by having it
do these hacks inside transactions so that other sessions can't see
the intermediate states. But I'm quite resistant to that answer,
because I think part of the point here is to ensure that concurrent
DDL doesn't misbehave. (Which it doesn't: these test fragments have
been there since 2018 and 2012 respectively, and not caused issues
AFAIK.) Preventing our tests from exercising concurrent DDL in order
to satisfy test_plan_advice is not a good plan IMO. There's also the
prospect of a long tail of whack-a-mole as we locate other places in
the tests where this sort of thing happens occasionally.

So I'm not sure what to do here, but we have a problem.

regards, tom lane

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sifaka&dt=2026-04-02%2016%3A35%3A13

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2026-04-02 23:49:21 Re: pg_waldump: support decoding of WAL inside tarfile
Previous Message Masahiko Sawada 2026-04-02 23:30:42 Re: POC: Parallel processing of indexes in autovacuum