Re: left-deep plans?

From: Kenneth Marshall <ktm(at)it(dot)is(dot)rice(dot)edu>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: left-deep plans?
Date: 2005-02-22 13:57:51
Message-ID: 20050222135751.GA13585@it.is.rice.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 22, 2005 at 05:40:40PM +1100, Neil Conway wrote:
> Tom Lane wrote:
> >Yes, and it's been rejected. The notion is obviously bogus; it amounts
> >to assuming that every database is a star schema with only one core table.
>
> Interesting; yes, I suppose that's true.
>
> >Once we get into GEQO territory, we are using the left-deep-only
> >heuristic because that's the only kind of plan GEQO can construct.
> >But at that point you've already given up any notion of exhaustive
> >search.
>
> I think most applications would prefer an exhaustive, deterministic
> search of a subset of the search space over a non-exhaustive,
> non-deterministic search of the same subset, given approximately the
> same performance. In other words, if confining the search to left-deep
> plans allows people to use the normal planner in situations where they
> would normally be forced to use GEQO to get acceptable performance, I
> think that would be a win.
>

GEQO is an attempt to provide a near-optimal join order without using
an exhaustive search. "An exhaustive, deterministic search of a subset
of the search space" has a non-zero probability of finding only a local
minimum in execution time. Most users really are only concerned with
final execution time and if the "subset" is mis-chosen the resulting
exhaustive search would completely and possibly disasterously miss the
optimal execution time.

In a paper on join-order optimization that I read recently, it was shown
that non-uniform sampling, with the application of cost-based pruning of
the search space, provided good results with quick convergence to within
a delta of the optimum plan. This has the nice property that you can
balence result quality with the amount of time spent optimizing the join
order. The piece missing from the current GECO algorithm is to ensure
that once the cost of a join plan is larger (or a piece of a plan), never
visit that plan again. The memory/learning piece is what provided the
ability to explore much more of the desirable (low execution time) search
space.

Since purely random selection, with learning, works so well, it may be
worth developing a new random join-order optimization algorithm, particularly
since most of the "genetic" features of the GECO we are not using.

Ken

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bostjan Potocnik 2005-02-22 14:47:18 big problem
Previous Message Robert Treat 2005-02-22 13:36:05 Re: Get rid of system attributes in pg_attribute?