Re: join removal

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alex Brasetvik <alex(at)brasetvik(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: join removal
Date: 2009-07-24 13:37:14
Message-ID: 603c8f070907240637x743cac64pf970876e159c8be4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 24, 2009 at 7:53 AM, Alex Brasetvik<alex(at)brasetvik(dot)com> wrote:
>
> On Jul 17, 2009, at 04:27 , Robert Haas wrote:
>
>> - INNER joins are more complex because what happens on the inner side
>> of the join can potentially wipe out rows from the result.  With a
>> LEFT join, it's sufficient to prove that the inner rel is at least
>> unique enough, but for an INNER join, we have to prove that it's
>> exactly UNIQUE enough.  I think we can only provide this when the
>> inner rel is a base relation with a unique index over EXACTLY (not a
>> subset of) the relevant columns AND there is a foreign key
>> relationship from the outer rel to the inner rel over the join
>> columns.
>
> Reasoning on foreign key relationships opens up for other optimization
> opportunities as well, so being able to prove that a join cannot alter the
> number of rows would be nice.
>
> For example, Limit-operators can possibly be pushed below a join that does
> not alter the result set, to reduce the amount of work done by the join.

Interesting, I hadn't thought about that, but it's an excellent point.
Another case that comes up is:

A LEFT JOIN (B INNER JOIN C ON Pbc) ON Pab

In general, this doesn't commute, because you need to emit a
NULL-extended copy of A whenever Pab has no match in B INNER JOIN C ON
Pbc. But if you know that Pbc will always be satisfied for exactly
one row in B, then you can decide to implement the join between B and
C as a left join rather than an inner join, so you get this:

A LEFT JOIN (B LEFT JOIN C ON Pbc) ON Pab

Now it commutes:

(A LEFT JOIN B ON Pab) LEFT JOIN C ON Pbc

I'm going to try to get the basic join removal code (for left joins,
which don't need foreign-key deduction) done for CommitFest 2009-09.
The next step is the foreign key deduction so we can remove inner
joins, but I'm not sure I'll have that for 8.5 unless someone wants to
either pitch in or cough up some money. Reordering joins around
limits is, I suspect, very difficult indeed, so should probably be a
project for phase 3.

...Robert

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2009-07-24 14:01:10 Re: When is a record NULL?
Previous Message Joshua Tolley 2009-07-24 13:36:46 Re: When is a record NULL?