| From: | lakshmi <lakshmigcdac(at)gmail(dot)com> |
|---|---|
| To: | Tomas Vondra <tomas(at)vondra(dot)me> |
| Cc: | Chengpeng Yan <chengpeng_yan(at)outlook(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, John Naylor <johncnaylorls(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Add a greedy join search algorithm to handle large join problems |
| Date: | 2026-02-16 11:26:13 |
| Message-ID: | CAEvyyTjuxiaL4_brvTsA3DZVcq5qgZS=yxFyNqoMtQWVZNiT1A@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi Tomas,
Thank you for the clarification. You’re right that the synthetic test
doesn’t strongly constrain join order and, with the default
join_collapse_limit , the DP comparison was limited.
I’m now preparing follow-up tests using selected JOB queries and adjusted
planner settings to provide a fair and more realistic comparison. I’ll
share the results soon.
Regards,
Lakshmi
On Mon, Feb 16, 2026 at 4:27 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>
>
> On 2/16/26 11:44, lakshmi wrote:
> > Hi Tomas,
> >
> > Thank you for the question.
> > The 15-table and 20-table results I shared were obtained using a
> > synthetic join workload designed to stress join-order planning and
> > measure planning-time scaling, rather than a JOB or TPC-H query.
> > Each query is essentially a left-deep chain of equality joins over
> > simple tables. For reference, the structure is equivalent to:
> >
>
> Isn't the left-deep shape more a feature of the plan than the query?
> With all the joins using the same "id" attribute, there will be one huge
> equivalence class, so AFAICS this does not really force any particular
> order (and we could easily pick a bushy plan).
>
> >
> > 15-table join
> >
> > SELECT count(*)
> >
> > FROM t1
> >
> > JOIN t2 ON t1.id <http://t1.id> = t2.id <http://t2.id>
> >
> > JOIN t3 ON t2.id <http://t2.id> = t3.id <http://t3.id>
> >
> > ...
> >
> > JOIN t15 ON t14.id <http://t14.id> = t15.id <http://t15.id>;
> >
> >
> >
> >
> > 20-table join
> >
> >
> >
> > SELECT count(*)
> >
> > FROM t1
> >
> > JOIN t2 ON t1.id <http://t1.id> = t2.id <http://t2.id>
> >
> > JOIN t3 ON t2.id <http://t2.id> = t3.id <http://t3.id>
> >
> > ...
> >
> > JOIN t20 ON t19.id <http://t19.id> = t20.id <http://t20.id>;
> >
> >
> > Regarding planner settings:
> >
> >
> > -geqo_thresholdwas set to:
> >
> > a high value (e.g., 100) to force DP
> >
> > a low value (e.g., 2) to allow GEQO/GOO
> >
> >
> >
> > -enable_goo_join_searchwas toggled on/off depending on the comparison
> > being measured.
> >
> >
> >
> > -Other planner parameters, including join_collapse_limit, were left at
> > their default values.
> >
>
> Well, this means the DP problem was limited to 8. Is that a fair
> comparison for the other algorithms?
>
> >
> > So these experiments mainly evaluate planning-time scaling and basic
> > plan sanity on a controlled join graph, rather than realistic workload
> > plan quality.
> >
> > I’m currently preparing additional tests using selected JOB queries to
> > provide more meaningful plan-quality comparisons and will share those
> > results once available.
> >
>
> OK
>
>
> --
> Tomas Vondra
>
>
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Jakub Wartak | 2026-02-16 11:34:17 | Re: 64-bit wait_event and introduction of 32-bit wait_event_arg |
| Previous Message | Bertrand Drouvot | 2026-02-16 11:17:32 | Re: Fix uninitialized xl_running_xacts padding |