Re: Performance improvement for joins where outer side is unique

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance improvement for joins where outer side is unique
Date: 2017-04-06 19:26:35
Message-ID: 4444.1491506795@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> writes:
> On 2 April 2017 at 21:21, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
>> I've attached an updated patch which updates the regression test output of
>> a recent commit to include the "Unique Inner" in the expected results.

> The patch must've fallen off. Attempt number 2 at attaching.

I'm looking through this, and I'm failing to see where it deals with
the problem we discussed last time, namely that you can't apply the
optimization unless all clauses that were used in the uniqueness
proof are included in the join's merge/hash conditions + joinquals.

It might be sufficient to ignore is_pushed_down conditions (at an
outer join only) while trying to make the proofs; but if it's doing
that, I don't see where.

I don't especially like the centralized unique_rels cache structure.
It's not terribly clear what it's for, and you're making uncomfortably
large assumptions about never indexing off the end of the array, despite
not having any range checks for the subscripts. Wouldn't it be better to
add simple List fields into RelOptInfo, representing the outer rels this
rel has been proven unique or not-unique for? That would dodge the array
size question and would be more readily extensible to someday applying
this to join rels.

I also think some more thought is needed about handling JOIN_UNIQUE_OUTER
and JOIN_UNIQUE_INNER cases. In the first place, the patch cavalierly
violates the statement in joinpath.c that those jointype values never
propagate outside that module. In the second place, shouldn't
JOIN_UNIQUE_INNER automatically result in the path getting marked
inner_unique? I suspect the logic in add_paths_to_joinrel ought to
look something like

if (jointype == JOIN_UNIQUE_INNER)
extra.inner_unique = true;
else
extra.inner_unique = innerrel_is_unique(root, outerrel, innerrel,
(jointype == JOIN_UNIQUE_OUTER ? JOIN_INNER : jointype),
restrictlist);

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2017-04-06 19:33:40 Re: [HACKERS] [GSoC] Push-based query executor discussion
Previous Message Peter Eisentraut 2017-04-06 19:01:29 Re: CONNECTION LIMIT and Parallel Query don't play well together