pgsql: Adjust definition of cheapest_total_path to work better with LAT

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: Adjust definition of cheapest_total_path to work better with LAT
Date: 2012-08-30 02:06:24
Message-ID: E1T6u9I-0000cn-Vj@gemulon.postgresql.org
Lists: pgsql-committers

Adjust definition of cheapest_total_path to work better with LATERAL.

In the initial cut at LATERAL, I kept the rule that cheapest_total_path
was always unparameterized, which meant it had to be NULL if the relation
has no unparameterized paths. It turns out to work much more nicely if
we always have *some* path nominated as cheapest-total for each relation.
In particular, let's still say it's the cheapest unparameterized path if
there is one; if not, take the cheapest-total-cost path among those of
the minimum available parameterization. (The first rule is actually
a special case of the second.)
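
As an illustration only, the selection rule described above might be sketched like this. This is not the code from the commit: SketchPath, is_subset() and choose_cheapest_total() are hypothetical names, the parameterization is modeled as a plain bitmask of the relations that must supply parameters, and "minimum" is assumed to mean set inclusion, with the earlier candidate kept when two sets are incomparable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a planner path (not PostgreSQL's Path). */
typedef struct SketchPath
{
    unsigned required_outer; /* bitmask of rels supplying parameters; 0 = unparameterized */
    double   startup_cost;
    double   total_cost;
} SketchPath;

/* True if every bit set in a is also set in b. */
int
is_subset(unsigned a, unsigned b)
{
    return (a & ~b) == 0;
}

/*
 * Nominate one path as cheapest-total: prefer the least-parameterized
 * candidate, and among equally parameterized ones the lowest total cost.
 * An unparameterized path (mask 0) is a subset of every other mask, so
 * the "cheapest unparameterized path" rule falls out as a special case.
 */
const SketchPath *
choose_cheapest_total(const SketchPath *paths, size_t npaths)
{
    const SketchPath *best = NULL;

    assert(npaths > 0);
    for (size_t i = 0; i < npaths; i++)
    {
        const SketchPath *p = &paths[i];

        if (best == NULL)
            best = p;
        else if (p->required_outer == best->required_outer)
        {
            /* Same parameterization: total cost decides. */
            if (p->total_cost < best->total_cost)
                best = p;
        }
        else if (is_subset(p->required_outer, best->required_outer))
            best = p;           /* strictly less parameterized */
        /* Otherwise best is less parameterized or incomparable: keep it. */
    }
    return best;
}
```

With this rule the function never returns NULL for a non-empty path list, which is the property the paragraph above relies on.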

This allows reversion of some temporary lobotomizations I'd put in place.
In particular, the planner can now consider hash and merge joins for
joins below a parameter-supplying nestloop, even if there aren't any
unparameterized paths available. This should bring planning of
LATERAL-containing queries to the same level as queries not using that
feature.

Along the way, simplify management of parameterized paths in add_path()
and friends. In the original coding for parameterized paths in 9.2,
I tried to minimize the logic changes in add_path(), so it just treated
parameterization as yet another dimension of comparison for paths.
We later made it ignore pathkeys (sort ordering) of parameterized paths,
on the grounds that ordering isn't a useful property for the path on the
inside of a nestloop, so we might as well get rid of useless parameterized
paths as quickly as possible. But we didn't take that reasoning as far as
we should have. Startup cost isn't a useful property inside a nestloop
either, so add_path() ought to discount startup cost of parameterized paths
as well. Having done that, the secondary sorting I'd implemented (in
add_parameterized_path) is no longer needed --- any parameterized path that
survives add_path() at all is worth considering at higher levels. So this
should be a bit faster as well as simpler.
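
To make the comparison rule concrete, here is a rough sketch of a dominance test in the spirit of the paragraph above. It is an assumption-laden model, not add_path() itself: CandPath and path_dominates() are hypothetical names, sort ordering is reduced to a single integer (0 meaning unordered), and costs are compared exactly rather than with any fuzz factor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified candidate path (not PostgreSQL's Path). */
typedef struct CandPath
{
    unsigned required_outer; /* bitmask of rels supplying parameters; 0 = none */
    int      sort_key;       /* 0 = unordered, otherwise an ordering id */
    double   startup_cost;
    double   total_cost;
} CandPath;

/*
 * Return 1 if path a makes path b redundant, else 0.
 * For a parameterized b, only parameterization and total cost matter;
 * startup cost and sort ordering are ignored, since neither is useful
 * on the inside of a nestloop.
 */
int
path_dominates(const CandPath *a, const CandPath *b)
{
    assert(a != NULL && b != NULL);

    /* a must not require any outer rel that b does not. */
    if ((a->required_outer & ~b->required_outer) != 0)
        return 0;
    if (a->total_cost > b->total_cost)
        return 0;

    /* b is parameterized: nothing else is worth comparing. */
    if (b->required_outer != 0)
        return 1;

    /* b is unparameterized (so a is too): startup cost and ordering count. */
    if (a->startup_cost > b->startup_cost)
        return 0;
    return b->sort_key == 0 || a->sort_key == b->sort_key;
}
```

Under this model, two parameterized paths with the same parameterization can no longer both survive merely because one starts faster, which is why no secondary sorting pass is needed afterwards.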

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/e83bb10d6dcf05a666d4ada00d9788c7974ad378

Modified Files
--------------
src/backend/optimizer/README | 17 ++-
src/backend/optimizer/geqo/geqo_eval.c | 10 +-
src/backend/optimizer/path/allpaths.c | 3 +-
src/backend/optimizer/path/joinpath.c | 85 +++++----
src/backend/optimizer/plan/planmain.c | 3 +-
src/backend/optimizer/util/pathnode.c | 320 +++++++++++++-------------------
src/include/nodes/relation.h | 18 +-
src/test/regress/expected/join.out | 13 +-
8 files changed, 217 insertions(+), 252 deletions(-)
