Re: Parameterized-path cost comparisons need some work

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Parameterized-path cost comparisons need some work
Date: 2012-04-13 14:17:48
Message-ID: CA+TgmoY1GxgZB-Dk3YjA6bMP4r=npW-HbWVrbmMw4uDK302yRg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 12, 2012 at 3:27 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> 1. Lobotomize add_path_precheck so it always returns true for a
> parameterized path.  This sounds horrid, but in the test cases I'm using
> it seems that this only results in doing the full path construction for
> a very small number of additional paths.

Query planner engineering is hard, because it's hard to predict what
kind of queries people will write, but this seems basically sane to
me. Given that (if I recall our previous discuss on this point
correctly) we avoid generating parameterized paths in situations where
we could have simply revised the join order instead, we should only
really be getting any parameterized paths at all in situations where
they are likely to help. Queries involving only inner joins, for
example, should never need a parameterized path covering more than a
single baserel; and even if you've got outer joins in the mix, most of
my queries tend to look like A IJ B IJ C IJ D LJ E LJ F LJ G, rather
than A IJ (B LJ C) which is where we actually need this machinery. If
we spend a little more effort in that case it's quite likely to be
worth it; the trick is just to keep the extra work from bleeding into
the cases where it won't help.

> 3. Rearrange plan generation so that a parameterized path always uses
> all join clauses available from the specified outer rels.  (Any that
> don't work as indexquals would have to be applied as filter conditions.)
> If we did that, then we would be back to a situation where all paths
> with the same parameterization should yield the same rowcount, thus
> justifying letting add_path_precheck work as it does now.
>
> #3 would amount to pushing quals that would otherwise be checked at the
> nestloop join node down to the lowest inner-relation level where they
> could be checked.  This is something I'd suspected would be a good idea
> to start with, but hadn't gotten around to implementing for non-index
> quals.  It had not occurred to me that it might simplify cost estimation
> to always do that.

This seems like it could be quite a significant win. It doesn't
really matter in <= 9.1 because in an old-style parameterized nestloop
the join filter is going to get applied immediately after the index
filter anyway, though I guess it's possible that you might save a
little bit by optimizing the order in which multiple filter conditions
are applied. But if there can be intermediate joins in there then
it's a big deal; and the fact that it makes it easier to compare paths
and prune away bad ones earlier seems like a major benefit as well.
So I would be in favor of doing this if possible, but...

> I'm going to take a closer look at #3, but it may not be practical to
> try to squeeze it into 9.2; if not, I think #1 will do as a stopgap.

....I agree with this, too.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-04-13 14:32:25 Improving our clauseless-join heuristics
Previous Message Tom Lane 2012-04-13 13:51:27 Re: Memory usage during sorting