Re: Add a greedy join search algorithm to handle large join problems

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: John Naylor <johncnaylorls(at)gmail(dot)com>, Chengpeng Yan <chengpeng_yan(at)outlook(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add a greedy join search algorithm to handle large join problems
Date: 2025-12-11 17:33:52
Message-ID: CAFj8pRASJuRQKHOoBTnR5aRUeRKpNAmrYQcBrQb=yqeZ_8me9Q@mail.gmail.com
Lists: pgsql-hackers

On Thu, Dec 11, 2025 at 18:07, Tomas Vondra <tomas(at)vondra(dot)me> wrote:

> On 12/11/25 07:12, Pavel Stehule wrote:
> >
> >
> > On Thu, Dec 11, 2025 at 3:53, John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
> >
> > On Wed, Dec 10, 2025 at 5:20 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
> > > I did however notice an interesting thing - running EXPLAIN on the 99
> > > queries (for 3 scales and 0/4 workers, so 6x 99) took this much time:
> > >
> > > master: 8s
> > > master/geqo: 20s
> > > master/goo: 5s
> >
> > > It's nice that "goo" seems to be faster than "geqo" - assuming the plans
> > > are comparable or better. But it surprised me switching to geqo makes it
> > > slower than master. That goes against my intuition that geqo is meant to
> > > be cheaper/faster join order planning. But maybe I'm missing something.
> >
> > Yeah, that was surprising. It seems that geqo has a large overhead, so
> > it takes a larger join problem for the asymptotic behavior to win over
> > exhaustive search.
> >
> >
> > If I understand the design correctly, geqo should be slower for any
> > query with lower complexity. The question is how many queries in the
> > tested model are really complex.
> >
>
> Depends on what you mean by "really complex". TPC-DS queries are not
> trivial, but the complexity may not be in the number of joins.
>
> Of course, setting geqo_threshold to 2 may be too aggressive. Not sure.
>

I checked the TPC-H queries, and almost all of them are simple: 5x JOIN,
2x nested subselect.

>
>
> regards
>
> --
> Tomas Vondra
>
>
