Re: WIP patch for parameterized inner paths

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP patch for parameterized inner paths
Date: 2012-01-26 14:14:42
Message-ID: CA+Tgmoax=uJs78OiwqRGPUpSgNN+JOnYUV9nam4mXacLXT3uKQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 25, 2012 at 11:24 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I wrote:
>> Attached is a WIP patch for parameterized paths, along the
>> lines we have discussed before: ...
>
> I've made considerable progress on the TODO items I listed: indxpath.c
> has been ripped apart and restructured, and I have it considering
> parameterized paths for hash sub-joins.  I made a deliberate policy
> decision not to work very hard on exploring parameterized mergejoin
> plans, because that seems to inflate the number of paths considered way
> more than the relative usefulness of merge over hash joins justifies.

I don't fully understand this code, especially not on my first time
through it, but it seems to me that the key to getting the performance
of this code up to where we'd like it to be is to control the number
of useless paths that get generated.

Is there a guard in here against joining a parameterized path to an
intermediate relation when no SJ is involved? In other words, if
we're joining a parameterized path on A to a path on B, then either
the join to B should satisfy at least part of the parameterization
needed by A, or there should be a special join with A and B on one
side and a relation that satisfies at least part of the
parameterization of A on the other. In the probably not uncommon case
where there are no SJs at all or all such SJs have only a single rel
on the nullable side, we ought to be able to avoid creating any more
paths than we do right now. Even if there are SJs with multiple rels
on the outside, we could try to implement some fast check for whether
intermediate paths make any sense: e.g. union the set of rels on the
nullable sides of SJs. Then, if the joinrel whose paths we're trying
to build isn't a subset of that set, the only thing worth considering
at this level is satisfying a parameterization by building a
parameter-driven nestloop: a bigger parameterized path will do us no
good.

Maybe there's a heuristic like this already in there; I just didn't
see it on first glance. I guess I'm surprised my the amount of
increase you're seeing in paths considered, though admittedly I
haven't seen your test cases. If we're properly filtering out the
paths that don't matter, I wouldn't have expected it to have as much
impact as you're describing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Abhijit Menon-Sen 2012-01-26 14:18:15 Re: Simulating Clog Contention
Previous Message Tomas Vondra 2012-01-26 13:54:28 Re: PATCH: tracking temp files in pg_stat_database