| From: | Andrei Lepikhov <lepihov(at)gmail(dot)com> |
|---|---|
| To: | Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: allowing extensions to control planner behavior |
| Date: | 2024-08-26 18:00:54 |
| Message-ID: | aaf4250a-77db-43e3-9720-f60e2e4f5eba@gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On 26/8/2024 18:32, Robert Haas wrote:
> I'm somewhat expecting to be flamed to a well-done crisp for saying
> this, but I think we need better ways for extensions to control the
> behavior of PostgreSQL's query planner. I know of two major reasons
It is the change I have been waiting for a long time. Remember how many
kludge codes in pg_hint_plan, aqo, citus, timescale, etc., are written
for only the reason of a small number of hooks - I guess many other
people could cheer such work.
> why somebody might want to do this. First, you might want to do
> something like what pg_hint_plan does, where it essentially implements
> Oracle-style hints that can be either inline or stored in a side table
> and automatically applied to queries.[1] In addition to supporting
> Oracle-style hints, it also supports some other kinds of hints so that
> you can, for example, try to fix broken cardinality estimates. Second,
My personal most wanted list:
- Selectivity list estimation hook
- Groups number estimation hook
- hooks on memory estimations, involving work_mem
- add_path() hook
- Hook on final RelOptInfo pathlist
- a custom list of nodes in RelOptinfo, PlannerStmt, Plan and Query
structures
- Extensibility of extended and plain statistics
- Hook on portal error processing
- Canonicalise expressions hook
> you might want to convince the planner to keep producing the same kind
> of plan that it produced previously. I believe this is what Amazon's
> query plan management feature[2] does, although since it is closed
> source and I don't work at Amazon maybe it's actually implemented
> completely differently. Regardless of what Amazon did in this case,
> plan stability is a feature people want. Just trying to keep using the
> same plan data structure forever doesn't seem like a good strategy,
> because for example it would be fragile in the case of any DDL
> changes, like dropping and recreating an index, or dropping or adding
As a designer of plan freezing feature [1] I can say it utilises
plancache and, being under its invalidation callbacks it doesn't afraid
DDL or any other stuff altering database objects.
> Unfortunately, the part about the hook having the freedom to delete
> paths isn't really true. Perhaps technically you can delete a path
> that you don't want to be chosen, but any paths that were dominated by
> the path you deleted have already been thrown away and it's too late
> to get them back. You can modify paths if you don't want to change
> their costs, but if you change their costs then you have the same
> problem: the contents of the pathlist at the time that you see it are
> determined by the costs that each path had when it was initially
> added, and it's really too late to editorialize on that. So all you
> can really do here in practice is add new paths.
From my standpoint, it is enough to export routines creating paths and
calculating costs.
> set_join_pathlist_hook, which applies to joinrels, is similarly
> limited. appendrels don't even have an equivalent of this hook.
>
> So, how could we do better?
>
> I think there are two basic approaches that are possible here. If
> someone sees a third option, let me know. First, we could allow users
> to hook add_path() and add_partial_path(). That certainly provides the
> flexibility on paper to accept or reject whatever paths you do or do
+1
> The attached patch, briefly mentioned above, essentially converts the
> enable_* GUCs into RelOptInfo properties where the defaults are set by
> the corresponding GUCs. The idea is that a hook could then change this
> on a per-RelOptInfo basis before path generation happens. For
IMO, it is better not to switch on/off algorithms, but allow extensions
to change their cost multipliers, modifying costs balance. 10E9 looks
like a disable, but multiplier == 10 for a cost node just provide more
freedom for hashing strategies.
[1] https://postgrespro.com/docs/enterprise/16/sr-plan
--
regards, Andrei Lepikhov
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Ayush Vatsa | 2024-08-26 18:09:35 | Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch |
| Previous Message | Heikki Linnakangas | 2024-08-26 17:54:13 | Re: thread-safety: getpwuid_r() |