Re: from_collapse_limit vs. geqo_threshold

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: from_collapse_limit vs. geqo_threshold
Date: 2009-05-21 12:13:09
Message-ID: 603c8f070905210513m29b1a916la3d0e53e8bf02f8d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, May 21, 2009 at 7:50 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> Robert,
>>> It appears that this statement has been in our documentation since Tom
>>> Lane added FROM_COLLAPSE_LIMIT (back then, it was capitalized) on
>>> January 25, 2003 (9bf97ff426de9), but I can't find any justification
>>> for it anywhere.  I think we either need to justify this advice, or
>>> remove it.
>
>> ... trying to remember why I wrote that ... what would happen if
>> FROM_COLLAPSE_LIMIT was *more* than GEQO_THRESHOLD?
>
> I think I wrote it, not you.  The point of the advice is to keep
> subquery collapsation (hm, what's the right noun form?  Need caffeine)
> from turning a non-GEQO query into a GEQO one, and thus subjecting
> you to unpredictable plans.  Maybe the resulting plans would be better
> on average, or maybe they wouldn't, but in any case they'd be
> unpredictable.

That's more or less what I figured, but my real world experience is
that pulling up subqueries and using GEQO leads to plans that are
random but tolerable, whereas not pulling up subqueries leads to plans
that are almost uniformly bad. Actually, it works OK if really would
have needed to materialize the entire subquery, but otherwise it
stinks. My real unvarnished opinion on this topic is that
from_collapse_limit is a loaded foot-gun waiting to go off. We might
as well have an option where if the number of tables in the query
exceeds a certain threshold, we'll just sequential-scan the table
rather than considering the use of indices. That option would
actually be better, because everyone who read the documentation would
be absolutely certain that they wanted to turn that option OFF,
whereas the behavior of from_collapse_limit is sufficiently complex
that it isn't obvious that it's a terrible idea.

...Robert

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2009-05-21 12:30:33 Re: Compiler warning
Previous Message Tom Lane 2009-05-21 11:50:56 Re: from_collapse_limit vs. geqo_threshold