| From: | Tomas Vondra <tomas(at)vondra(dot)me> |
|---|---|
| To: | John Naylor <johncnaylorls(at)gmail(dot)com> |
| Cc: | Chengpeng Yan <chengpeng_yan(at)outlook(dot)com>, lakshmi <lakshmigcdac(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com> |
| Subject: | Re: Add a greedy join search algorithm to handle large join problems |
| Date: | 2026-05-14 08:49:09 |
| Message-ID: | 18810207-0580-471c-b16d-6e553037d1d7@vondra.me |
| Lists: | pgsql-hackers |
On 5/14/26 07:59, John Naylor wrote:
> On Wed, May 13, 2026 at 6:54 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>>
>> On 5/13/26 06:34, John Naylor wrote:
>>> On Tue, May 12, 2026 at 8:42 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>
>>>> To get to the GEQO/GOO code, people have to adjust the limits, so that
>>>> (join_collapse_limit >= geqo_threshold). AFAIK almost no one does that,
>>>> so most join problems are smaller than geqo_threshold and so handled by
>>>> regular DP (for smaller subproblems).
>>>>
>>>> But let's say someone adjusts the GUCs, gets to GOO, and it handles a
>>>> prefix of the problem using the DP approach. How is that different from
>>>> keeping the (join_collapse_limit < geqo_threshold) and not even getting
>>>> to GOO? Why not just leave join_collapse_limit at a low value?
>>>
>>> One flavor of the second idea above (found in the literature) is to
>>> start with DP, and after some new GUC limit (under the hood: # of
>>> times calling make_join_rel), pick one of the incomplete subproblems
>>> and finish with a heuristic search. It switches strategy on the fly,
>>> rather than choosing upfront. That's the difference from now, although
> I'm not sure offhand how join_collapse_limit chooses its subproblems
>>> out of the bigger problem.
>>>
>>
>> Not sure, for two reasons.
>>
>> How often do we expect this to be used? AFAIK GEQO is not used very
>> often in practice, because geqo_threshold is so much higher than
>> join_collapse_limit. So even huge joins get divided into problems
>> handled by DP. We might invest a lot of time and complexity into
>> something that gets used extremely rarely ...
>
> That's a risk to keep in mind.
>
> In cases where planning time is a substantial part of the runtime
> (like in your thread on OLTP star joins), a user could lower the
> threshold for greedy search, since it's fast. That's not an option
> currently because GEQO is slow.
>
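[For readers following along: the greedy search John refers to (GOO, greedy operator ordering) can be sketched roughly as below. This is a toy illustration, not PostgreSQL code; the cardinalities and selectivities are made-up inputs, and the quadratic pair scan is what makes it so much cheaper than exhaustive DP.]

```python
# Toy sketch of greedy operator ordering (GOO): repeatedly join the pair
# of relations whose join has the smallest estimated cardinality.
# card: {rel: row estimate}; sel: {(a, b): join selectivity} (symmetric).
def goo(card, sel):
    rels = {r: r for r in card}          # rel name -> current join tree
    card, sel = dict(card), dict(sel)    # work on copies

    def join_card(a, b):
        s = sel.get((a, b), sel.get((b, a), 1.0))
        return card[a] * card[b] * s

    while len(rels) > 1:
        names = list(rels)
        # greedily pick the pair with the smallest join cardinality
        a, b = min(((x, y) for i, x in enumerate(names) for y in names[i + 1:]),
                   key=lambda p: join_card(*p))
        new = a + b                       # synthetic name for the joined rel
        rels[new] = (rels.pop(a), rels.pop(b))
        card[new] = join_card(a, b)
        # inherit selectivities toward the remaining relations
        for r in list(rels):
            if r != new:
                sel[(new, r)] = (sel.get((a, r), sel.get((r, a), 1.0)) *
                                 sel.get((b, r), sel.get((r, b), 1.0)))
        del card[a], card[b]
    return next(iter(rels.values()))
```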
Perhaps. But there's also the discussion in [1], where Tom asked whether
it's time to update the join_collapse_limit/geqo_threshold defaults, and
Andrei responded that join_collapse_limit=40/geqo_threshold=14 worked OK
for the case he was investigating.
Those values are not meant to be the new defaults, but chances are it's
time to update the defaults in some way. And that may alter the chance
of actually triggering GEQO.
[1] https://www.postgresql.org/message-id/3525339.1770668216%40sss.pgh.pa.us
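[To make the interaction concrete, here is a deliberately simplified model of why GEQO rarely triggers with the current defaults: explicit JOIN lists are collapsed only up to join_collapse_limit items, so each subproblem handed to join search is capped at that size, and GEQO takes over only when a subproblem reaches geqo_threshold. This ignores from_collapse_limit, FROM-list flattening, and other details of the real planner.]

```python
# Simplified model: with the defaults (join_collapse_limit=8,
# geqo_threshold=12) no collapsed subproblem can ever reach the GEQO
# threshold, so everything goes through regular DP.
def join_search_used(subproblem_size, join_collapse_limit=8, geqo_threshold=12):
    size = min(subproblem_size, join_collapse_limit)
    return "geqo" if size >= geqo_threshold else "dp"
```

With Andrei's values (40/14), a 40-way join would actually reach GEQO, which is what changes the picture.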
> With a dynamic strategy, I imagine join_collapse_limit would mean
> something different (one technique in the literature is to refine the
> greedy output on subproblems of size 4-7), or be replaced with a
> different tuning knob.
>
> Anyway, these questions reinforce that this is potential follow-up work.
>
Agreed.
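[As an aside, the refinement technique quoted above (re-optimizing the greedy output on subproblems of size 4-7) could be sketched as follows. This is a toy stand-in: cost() is a made-up cost function supplied by the caller, and brute-force permutation of each window stands in for running DP on it.]

```python
# Toy sketch: refine a linear join order by sliding a window of k
# relations over it and re-optimizing each window exhaustively
# (a stand-in for DP on small subproblems of size 4-7).
from itertools import permutations

def refine(order, cost, k=4):
    order = list(order)
    for i in range(len(order) - k + 1):
        window = order[i:i + k]
        best = min(permutations(window),
                   key=lambda p: cost(order[:i] + list(p) + order[i + k:]))
        order[i:i + k] = best
    return order
```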
regards
--
Tomas Vondra