Re: pg_plan_advice

From: Lukas Fittl <lukas(at)fittl(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_plan_advice
Date: 2026-03-13 08:38:25
Message-ID: CAP53PkzKeD=t90OfeMsniYrcRe2THQbUx3g6wV17Y=ZtiwmWTQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Robert,

On Thu, Mar 12, 2026 at 10:15 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I've now committed the main patch. I think that I'm not likely to get
> too much more feedback on that in this release cycle, and I judge that
> more people will be sad if it doesn't get shipped this cycle than if
> it does. That might turn out to be catastrophically wrong, but if many
> problem reports quickly begin to emerge, it can always be reverted.

I think its good to get this in, and not do it in the last minute, but
just to be clear on my part, I've reviewed earlier versions, but I
have not reviewed the latest code to the extent I would have liked
before it was committed, due to competing priorities on my end. That
said, its a good forcing function, so let me do my part to help shake
out any bugs that remain.

From a code review perspective, I found some minor issues:

- Identifiers that are named like advice types cause an error (so I
can't name my table "hash_join")
- pgpa_destroy_join_unroller does not free the inner_unrollers values
(which I think goes against the intent of cleaning up the allocations
right away)
- pgpa_parser uses pgpa_yyerror, but that function doesn't use
elog(ERROR) like the main parser does - I think it'd be more
consistent to use YYABORT explicitly, e.g. how we do it in cubeparse
- pgpa_scanner accepts integers with underscores, but incorrectly uses
a simple strtoint on them (which would fail), instead of pg_strtoint32
/ pg_strtoint32_safe
- pgpa_walk_recursively uses "0" for the boolean value being passed to
a recursive call in one case (should be "false")

Attached nocfbot-0001 that addresses these.

From a usability perspective, I do wonder about two things when it
comes to users specifying advice directly (which will happen in
practice, e.g. when debugging plan problems):

1) The lack of being able to target scan advice (e.g. SEQ_SCAN) to a
partition parent is frustrating - I believe we discussed partitioning
earlier in the thread in terms of gathering+applying it, but I do
wonder if we shouldn't at least allow users to specify a partitioned
table/partitioned index, instead of only the children. See attached
nocfbot-0002 for an idea what would be enough, I think.

2) I find the join strategy advice hard to use - pg_hint_plan has
hints (e.g. HashJoin) that allow saying "use a hash join when joining
these two rels in any order", vs pg_plan_advice requires setting a
specific outer rel, which only makes sense when you want to fully
specify every aspect of the plan. I suspect users who directly write
advice will struggle with specifying join strategy advice as it is
right now. We could consider having a different syntax for saying "I
want a hash join when these two rels are joined, but I don't care
about the order", e.g. "USE_HASH_JOIN(a b)". If you think that's
worthwhile I'd be happy to take a stab at a patch.

> I'm still hoping to get some more feedback on the remaining patches,
> which are much smaller and conceptually simpler. While there is no
> time to redesign them at this point in the release cycle, there is
> still the opportunity to fix bugs, or decide that they're too
> half-baked to ship. So here is v20 with just those patches. Of course,
> post-commit review of the main patch is also very welcome.

For v20-0001, from a quick conceptual review:

I find the two separate GUC mechanisms for local backend vs shared
memory a bit confusing as a user (which one should I be using?).
Enabling the shared memory mechanism on a system-wide basis seems like
it'd likely have too high overhead anyway for production systems?
(specifically worried about the advice capturing for each plan, though
I haven't benchmarked it)

I wonder if we shouldn't keep this simpler for now, and e.g. only do
the backend local version to start - we could iterate a bit on
system-wide collection out-of-core, e.g. I'm considering teaching
pg_stat_plans to optionally collect plan advice the first time it sees
a plan ID (plan advice is roughly a superset of what we consider a
plan ID there), and then we could come back to this for PG20.

For v20-0002:

I think we need a form of this in tree, because otherwise
pg_plan_advice breaks as planner logic changes. This of course puts a
burden on other hackers making changes, but I think that's what we're
signing up for with having pg_plan_advice in contrib - and maybe we're
overestimating how often that'll actually be a problem.

To help assess impact, I did a quick test run and looked at three
not-yet-committed patches in the commitfest that affect planner logic
([0], [1] and [2]), to see if they'd require pg_plan_advice changes
(master with v20-0002 applied). Maybe I picked the wrong patches, but
at least with those no pg_plan_advice changes were needed with the
test_plan_advice test enabled.

On the code itself:
- Is there a reason you're setting
"pg_plan_advice.always_store_advice_details" to true, instead of using
pg_plan_advice_request_advice_generation?
- I wonder if we could somehow detect advice not matching? Right now
that'd be silently ignored, i.e. you'd only get a test failure when we
generate the wrong advice that causes a plan change in the regression
tests.

For v20-0003, initial thoughts:

I think getting at least a basic version of this in would be good, as
a server-wide way to set advice for queries can help people get out of
a problem when Postgres behaves badly - and we know from pg_hint_plan
(which has a hint table) that this can be useful even without doing
any kind of parameter sniffing/etc to be smart about different
parameters for the same query.

The name "stash" feels a bit confusing as an end-user facing term.
Maybe something like "pg_apply_advice", or "pg_query_advice" would be
better? (though I kind of wish we could tie it more closely to "plan
advice", but e.g. "pg_plan_advice_apply" feels too lengthy)

---

Thanks,
Lukas

[0]: https://commitfest.postgresql.org/patch/5487/ ("Pull-up subquery
if INNER JOIN-ON contains refs to upper-query")
[1]: https://commitfest.postgresql.org/patch/5738/ ("Improve hash
join's handling of tuples with null join keys")
[2]: https://commitfest.postgresql.org/patch/6315/ ("Enable
partitionwise join for partition keys wrapped by RelabelType")

--
Lukas Fittl

Attachment Content-Type Size
nocfbot-0002-pg_plan_advice-Allow-targeting-scans-by-partition-pa.patch application/x-patch 17.4 KB
nocfbot-0001-pg_plan_advice-Various-fixes.patch application/x-patch 8.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ryo Matsumura (Fujitsu) 2026-03-13 08:38:35 [PATCH] Docs: clarify default values of EXPLAIN BUFFERS and SERIALIZE
Previous Message Chao Li 2026-03-13 08:37:15 Re: [PATCH] Support automatic sequence replication