Re: Review remove {join,from}_collapse_limit, add enable_join_ordering

From: Kenneth Marshall <ktm(at)rice(dot)edu>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Subject: Re: Review remove {join,from}_collapse_limit, add enable_join_ordering
Date: 2009-07-16 20:22:30
Message-ID: 20090716202230.GA1452@it.is.rice.edu
Lists: pgsql-hackers

On Thu, Jul 16, 2009 at 06:49:08PM +0200, Andres Freund wrote:
> On Thursday 16 July 2009 17:59:58 Tom Lane wrote:
> > Andres Freund <andres(at)anarazel(dot)de> writes:
> > > The default settings currently make it relatively hard to trigger geqo at
> > > all.
> >
> > Yes, and that was intentional. One of the implications of what we're
> > discussing here is that geqo would get used a lot more for "typical
> > complex queries" (if there is any such thing as a typical one). So
> > it's fully to be expected that the fallout would be pressure to improve
> > geqo in various ways.
> >
> > Given that we are at the start of the development cycle, that prospect
> > doesn't scare me --- there's plenty of time to fix whatever needs
> > fixing. However, I am leaning to the feeling that I don't want to be
> > putting people in a position where they have no alternative but to use
> > geqo. So adjusting rather than removing the collapse limits is seeming
> > like a good idea.
> Hm. I see a somewhat more fundamental problem with geqo:
> I tried several queries and did not find a single one where the whole
> genetic process made any significant improvement to the 'worth'.
> It seems that the best variant out of the pool is always either the path
> chosen in the end, or at least the cost difference is _really_ low.
>
>
> Andres
>

Hi Andres,

From my reading of the literature on join order optimization via
random sampling, such as the sampling that establishes the initial
GEQO pool, there is a very good chance of finding a "pretty good"
plan in the first pool, especially with our larger initial pool
sizes of 100-1000. In fact, the final plan has a good chance of
costing approximately the same as some member of the initial pool.
Uniform sampling alone can give you a close-to-optimal plan 80% of
the time with an initial sample size of 100, and biased sampling
raises that to 99% or better.
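
To make the sampling argument concrete, here is a minimal sketch of
the experiment I have in mind. Everything in it is an assumption made
for illustration: the cost model (sum of intermediate result sizes for
a left-deep plan), the randomly generated row counts and selectivities,
and all of the function names are hypothetical and have nothing to do
with the real GEQO code or the planner's cost functions. It compares the
best of 100 uniformly sampled join orders against the true optimum found
by brute force over all orders of a 7-relation query.

```python
import itertools
import random


def make_query(n, rng):
    """Build a hypothetical n-relation query: row counts plus pairwise selectivities."""
    rows = [rng.randint(100, 100_000) for _ in range(n)]
    sel = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Roughly half of the pairs get a join predicate; the rest
            # keep selectivity 1.0, i.e. behave like a cross product.
            if rng.random() < 0.5:
                sel[i][j] = sel[j][i] = 10 ** rng.uniform(-5, -1)
    return rows, sel


def plan_cost(order, rows, sel):
    """Toy cost of a left-deep plan: the sum of all intermediate result sizes."""
    joined = [order[0]]
    size = float(rows[order[0]])
    total = 0.0
    for rel in order[1:]:
        size *= rows[rel]
        for prev in joined:
            size *= sel[prev][rel]
        joined.append(rel)
        total += size
    return total


def best_of_sample(n, rows, sel, sample_size, rng):
    """Cheapest plan among sample_size uniformly random join orders."""
    best = float("inf")
    for _ in range(sample_size):
        order = rng.sample(range(n), n)  # uniform random permutation
        best = min(best, plan_cost(order, rows, sel))
    return best


def optimum(n, rows, sel):
    """True optimum by exhaustive search; only feasible for small n."""
    return min(plan_cost(p, rows, sel) for p in itertools.permutations(range(n)))


def run_experiment(n=7, sample_size=100, trials=20, seed=0):
    """Return, per trial, the ratio of best-sampled cost to optimal cost (>= 1)."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(trials):
        rows, sel = make_query(n, rng)
        sampled = best_of_sample(n, rows, sel, sample_size, rng)
        ratios.append(sampled / optimum(n, rows, sel))
    return ratios


if __name__ == "__main__":
    ratios = run_experiment()
    near_optimal = sum(r <= 2.0 for r in ratios)
    print(f"{near_optimal}/{len(ratios)} trials within 2x of the optimum")
```

The "within 2x" threshold is arbitrary, and the numbers this prints say
nothing about real queries. The point is only the shape of the
experiment: if the best member of a random sample is usually close to
the optimum, the genetic phase has very little left to improve.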

Regards,
Ken
