Re: *_collapse_limit, geqo_threshold

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: *_collapse_limit, geqo_threshold
Date: 2009-07-07 15:11:49
Message-ID: BE7B2EE0-66A6-45A3-BF89-C7805C835AF0@gmail.com
Lists: pgsql-hackers

On Jul 7, 2009, at 9:31 AM, "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:

> Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>> I'm interested in hearing from anyone who has practical experience
>> with tuning these variables, or any ideas on what we should test to
>> get a better idea as to how to set them.
>
> I don't remember any clear resolution to the wild variations in plan
> time mentioned here:
>
> http://archives.postgresql.org/pgsql-hackers/2009-06/msg00743.php
>
> I think it would be prudent to try to figure out why small changes in
> the query caused the large changes in the plan times Andres was
> seeing. Has anyone else ever seen such behavior? Can we get
> examples? (It should be enough to get the statistics and the schema,
> since this is about planning time, not run time.)

Well, there's not really enough information there to figure out
specifically what was happening, but from 10,000 feet,
join_collapse_limit and from_collapse_limit constrain the join order.
If the estimates are all accurate, setting them to a value < infinity
will either leave the plans unchanged or make them worse. If it's
making them better, then the estimates are off and the join order
constraint happens to be preventing the planner from considering the
case that really hurts you. But that's mostly luck.
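
To see the constraint in action, here's a minimal sketch on made-up
tables a, b, and c: with join_collapse_limit = 1 the planner keeps
exactly the join order the query text spells out.

    -- hypothetical tables; a limit of 1 pins the textual join order
    SET join_collapse_limit = 1;
    EXPLAIN
    SELECT *
      FROM a
      JOIN b ON b.a_id = a.id
      JOIN c ON c.b_id = b.id;
    RESET join_collapse_limit;

That will always join a to b first, whether or not that's the cheap
order; raising the limit back up lets the planner search again.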

> My own experience is that when we investigate a complaint about a
> query not performing to user or application programmer expectations,
> we have sometimes found that boosting these values has helped. We
> boost them overall (in postgresql.conf) without ever having seen a
> downside. We currently have geqo disabled and set both collapse
> limits to 20. We should probably just set them both to several
> hundred and not wait until some query with more than 20 tables
> performs badly, but I'm not sure we have any of those yet.
>
> In short, my experience is that when setting these higher has made any
> difference at all, it has always generated a plan that saved more time
> than the extra planning required. Well, I'd bet that there has been
> an increase in the plan time of some queries which wound up with the
> same plan anyway, but the difference has never been noticeable; the
> net effect has been a plus for us.

You have a big dataset, AIUI, so values that are right for you might
be too high for someone with, say, an OLTP workload.
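
For reference, the configuration you describe amounts to roughly this
in postgresql.conf (your values from above; whether they're sane
defaults for everyone is exactly the question):

    # settings as described upthread; suitability is workload-dependent
    geqo = off
    join_collapse_limit = 20
    from_collapse_limit = 20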

> I guess the question is whether there is anyone who has had a contrary
> experience. (There must have been some benchmarks to justify adding
> geqo at some point?)

GEQO or something like it is certainly needed for very large planning
problems. The non-GEQO planner takes exponential time in the size of
the problem, so at some point that's going to get ugly. But
triggering it at the level we do now seems unnecessarily pessimistic
about what constitutes too much planning.
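
A crude way to feel that curve, assuming a dozen hypothetical tables
t1 through t12 that each have an id column:

    \timing on
    SET geqo = off;   -- force the exhaustive planner
    EXPLAIN SELECT *
      FROM t1, t2, t3, t4, t5, t6, t7, t8, t9, t10, t11, t12
     WHERE t1.id = t2.id AND t2.id = t3.id AND t3.id = t4.id
       AND t4.id = t5.id AND t5.id = t6.id AND t6.id = t7.id
       AND t7.id = t8.id AND t8.id = t9.id AND t9.id = t10.id
       AND t10.id = t11.id AND t11.id = t12.id;
    SET geqo = on;    -- twelve relations hits the default geqo_threshold
    -- now re-run the same EXPLAIN and compare the \timing numbers

Since this is about planning time rather than execution time, EXPLAIN
alone is enough for the comparison.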

...Robert
