inner join removal

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: inner join removal
Date: 2010-07-08 18:27:43
Message-ID: AANLkTinS_MlZ2F3Siwgcje--qf5nTJTpuHFnZdcA45bU@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

So I sat down to work on some code to do inner join removal, and
quickly got stuck. As a first step, I wanted to write some code that
would identify inner joins that could be treated as LEFT JOINs if we
so chose. Consider:

SELECT 1 FROM foo, bar WHERE foo.x = bar.x;

If it so happens that there is a left join constraint from either foo
(x) or bar (x) referencing the other one, then the referenced column
has a unique index on it; if, further, the referencing column is not
null, then every row in the referencing table will have exactly one
partner in the referenced table. Thus, the join can be implemented as
a LEFT JOIN without changing the results. In this case, that also
means the join can be removed, but it can be a win anyway. Consider:

SELECT * FROM foo LEFT JOIN (bar JOIN baz ON bar.y = baz.y) ON foo.x = bar.x;

If foo is itty bitty and bar and baz are enormous, it would be nice to
start by joining foo to bar and then joining the result to baz, but
that's not legal. However, if bar (y) references baz (y) and bar.y is
not null, then the inner join is equivalent to a left join and it's OK
to commute them. Anyway, I didn't get near as far as actually trying
to implement this fancy footwork because I got stuck on a much simpler
problem. I thought that it would make sense to iterate through all
the relations in the query and look at the join clauses for each one.
So in the above query, for example, I was expecting that when I looked
at the RelOptInfo for baz, its joinlist would contain bar.y = baz.y.
It doesn't, because bar.y = baz.y gets eaten by the equivalence class
machinery.

Essentially, what I want to do is ask, for each relation, whether
there is exactly one other relation that gets joined to it, and then
if that proves to be the case, get a list of all the join clauses and
examine them. But I can't figure out how to do that.

Help?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2010-07-08 18:40:51 Re: [COMMITTERS] pgsql: Add note that using PL/Python 2 and 3 in the same session will
Previous Message Josh Berkus 2010-07-08 17:10:28 Re: Admission Control