Re: Re: [COMMITTERS] pgsql: Rewrite GEQO's gimme_tree function so that it always finds a

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Greg Sabino Mullane <greg(at)turnstep(dot)com>
Subject: Re: Re: [COMMITTERS] pgsql: Rewrite GEQO's gimme_tree function so that it always finds a
Date: 2009-11-28 01:23:08
Message-ID: 2230.1259371388@sss.pgh.pa.us
Lists: pgsql-committers pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Mon, Nov 9, 2009 at 1:10 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Yowza. 18000 distinct paths for one relation? Could we see the test
>> case?

> Hey, wait a minute. Unless I'm missing something, the items being
> accumulated here are not paths (as Tom said upthread and I naively
> believed), but RelOptInfos.

Yeah, I failed to think carefully about that.

> It looks like we create a RelOptInfo for
> every legal join order, so this is going to happen on any large-ish
> query where the joins can be reordered relatively freely.

Actually, it's more like a RelOptInfo for every combination of base
rels that could be formed along any legal join path (where "legal"
includes the heuristics that avoid clauseless joins when possible).
So the worst case is 2^n for n base rels, but usually considerably
fewer. Nonetheless, even a small fraction of 2^37 is Too Many.
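
To put a number on that: the exhaustive planner builds one joinrel for
each connected subset of the join graph (modulo the clauseless-join
heuristics). Here is a throwaway C program (nothing to do with the
actual sources; it assumes GCC's __builtin_popcount) that counts those
subsets for a chain versus a clique of N rels:

    /* Count the connected subsets (size >= 2) of a small join graph,
     * i.e. the joinrels an exhaustive planner would have to build.
     * Adjacency is kept as bitmasks so a subset fits in a uint32_t. */
    #include <stdio.h>
    #include <stdint.h>

    static int
    is_connected(uint32_t subset, const uint32_t adj[], int n)
    {
        uint32_t seen = subset & (~subset + 1);   /* lowest member */

        for (;;)
        {
            uint32_t next = seen;
            int      i;

            for (i = 0; i < n; i++)
                if (seen & (1u << i))
                    next |= adj[i] & subset;   /* reachable members */
            if (next == seen)
                break;
            seen = next;
        }
        return seen == subset;
    }

    int
    main(void)
    {
        enum { N = 16 };
        uint32_t chain[N] = {0};
        uint32_t clique[N];
        long     nchain = 0, nclique = 0;
        uint32_t s;
        int      i;

        for (i = 0; i < N; i++)
        {
            if (i > 0)
                chain[i] |= 1u << (i - 1);
            if (i < N - 1)
                chain[i] |= 1u << (i + 1);
            clique[i] = ((1u << N) - 1) & ~(1u << i);
        }

        for (s = 1; s < (1u << N); s++)
        {
            if (__builtin_popcount(s) < 2)
                continue;
            nchain += is_connected(s, chain, N);
            nclique += is_connected(s, clique, N);
        }
        /* chain: N*(N-1)/2 = 120; clique: 2^N - N - 1 = 65519 */
        printf("chain:  %ld joinrels\n", nchain);
        printf("clique: %ld joinrels\n", nclique);
        return 0;
    }

A chain grows only quadratically, but a clique hits 2^N - N - 1; at
N = 37 that is on the order of 10^11 joinrels, which is where the
memory goes.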

> I don't think there's any easy fix for this.

Nope :-(. I was able to get rid of the specific O(N^2) behavior that
you were running into, but that just pushes the problem out a bit.
FWIW, a profile of CVS HEAD running on this query looks like

samples   %        symbol name
208189    15.9398  compare_fuzzy_path_costs
116260     8.9014  add_path
 97509     7.4657  AllocSetAlloc
 78354     5.9991  make_canonical_pathkey
 69027     5.2850  generate_join_implied_equalities
 65153     4.9884  cost_nestloop
 59689     4.5700  bms_is_subset
 49176     3.7651  add_paths_to_joinrel
 39720     3.0411  bms_overlap
 32562     2.4931  cost_mergejoin
 30101     2.3047  pathkeys_useful_for_merging
 26409     2.0220  AllocSetFree
 24459     1.8727  MemoryContextAllocZeroAligned
 23247     1.7799  hash_search_with_hash_value
 16018     1.2264  check_list_invariants

which is at least unsurprising. However, it eats memory fast enough
to start swapping after level 7 or so, so there is no way that the
exhaustive planner is ever going to finish this problem.
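
(For the record, the comparator at the top of that profile is "fuzzy"
in roughly the sense sketched below. This is a simplification, not the
actual compare_fuzzy_path_costs, which also distinguishes startup cost
from total cost; the point is just that costs within about 1% are
treated as equal so add_path doesn't thrash on noise.)

    /* Simplified sketch of a fuzzy cost comparison, qsort-style:
     * returns +1/-1 only when the costs differ by more than 1%. */
    static int
    fuzzy_cost_cmp(double cost1, double cost2)
    {
        if (cost1 > cost2 * 1.01)
            return 1;       /* path 1 noticeably more expensive */
        if (cost2 > cost1 * 1.01)
            return -1;      /* path 2 noticeably more expensive */
        return 0;           /* within the fuzz factor */
    }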

> Playing around with this a bit, I was easily able to get 2-second
> planning times on a 15-table join, 6-second planning times on a
> 16-table join, and 30-second planning times on a 17-table join. This
> makes it hard to support raising the GEQO threshold, as most recently
> suggested by Greg Sabino Mullane here (and previously by me on an
> earlier thread):

Yeah. I think GEQO or other approximate methods are the only hope for
very large join problems. We could, however, hope to get to the point
of being able to raise the collapse limits, rather than keeping them
below the geqo_threshold by default (the collapse limits currently
default to 8, geqo_threshold to 12). In principle that should result
in better plans ...
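
(For anyone following along, the handoff between the two strategies is
roughly the following; this is a paraphrase of make_rel_from_joinlist()
in allpaths.c, not verbatim source.)

    /* Once the joinlist that survives the collapse limits has
     * geqo_threshold or more items, hand planning to GEQO instead of
     * the exhaustive search in standard_join_search(). */
    if (enable_geqo && levels_needed >= geqo_threshold)
        return geqo(root, levels_needed, initial_rels);
    else
        return standard_join_search(root, levels_needed, initial_rels);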

regards, tom lane
