Re: *_collapse_limit, geqo_threshold

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Dimitri Fontaine <dim(at)hi-media(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: *_collapse_limit, geqo_threshold
Date: 2009-07-08 20:57:33
Message-ID: 10048.1247086653@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Jul 7, 2009 at 6:33 PM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> It's pretty much all-or-nothing now: the GUC does not give you any sort
>> of useful control over *which* joins are reorderable.

> Yes. So the way I see it, the options are:

> 1. We can remove join_collapse_limit completely and provide no
> substitute. In this case, the ability to explicitly specify the join
> order will be gone.

> 2. We can remove join_collapse_limit but provide a different, Boolean
> GUC instead, like enable_join_reordering. In this case, we're not
> actually reducing the number of GUCs, just the size of the foot-gun.

> 3. We can remove join_collapse_limit and provide an alternative way to
> explicitly specify the join order that is more flexible. This both
> reduces the number of GUCs and arguably provides some useful
> functionality that we don't have now.

> It sounds like your vote is for #2, which, as I say, seems like a
> feature with one arm tied behind its back, but hey, what do I know?

Well, the reason I'm not voting for #3 is that it looks like a lot of
work to implement something that would basically be a planner hint,
which I'm generally against; furthermore, it's a hint that there's been
no demand for. (We're not even certain that anyone is using the ability
to *fully* specify the join order, much less wanting some undetermined
compromise between manual and automatic control.) And anyway I didn't
hear anyone volunteering to do it. So the realistic alternatives are
#1, #2, or "do nothing"; and out of those I like #2.

> Accepting that as the consensus in the absence of contrary votes, we
> still need to decide what to do about from_collapse_threshold and
> geqo_threshold. I'm pretty sure that we shouldn't eliminate GEQO or
> geqo_threshold, because the basic algorithm is clearly exponential
> time and eventually you have to start worrying about that, but we
> could raise the value. What to do about from_collapse_threshold is
> less clear to me.

I do not think there is a good argument for eliminating geqo_threshold.
There might well be an argument for cranking up its default value;
but that would take some hard data, which seems lacking at the moment.

I'm on the fence about from_collapse_threshold. The argument for having
it seems to be that there might be cases where not folding a subquery
is preferable to folding it and then taking your chances with GEQO.
But I'm not really convinced there are any.

It occurs to me that one way to make GEQO less scary would be to take
out the nondeterminism by resetting its random number generator for
each query. You might get a good plan or an awful one, but at least
it'd be the same one each time. DBAs like predictability.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2009-07-08 21:13:11 Re: *_collapse_limit, geqo_threshold
Previous Message Mike Rylander 2009-07-08 20:18:31 Fwd: tsvector extraction patch