Re: Performance improvement for joins where outer side is unique

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance improvement for joins where outer side is unique
Date: 2015-03-04 09:00:31
Message-ID: CAApHDvp3SwmJCng0E778XnqafjNLmAVCXfhQ1MUEYqM4Ob6=kw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On 27 February 2015 at 06:48, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:

> On 26.2.2015 18:34, Tom Lane wrote:
> > Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
> >> FWIW this apparently happens because the code only expect that
> >> EquivalenceMembers only contain Var, but in this particular case that's
> >> not the case - it contains RelabelType (because oprcode is regproc, and
> >> needs to be relabeled to oid).
> >
> > If it thinks an EquivalenceMember must be a Var, it's outright
> > broken; I'm pretty sure any nonvolatile expression is possible.
>
> I came to the same conclusion, because even with the RelabelType fix
> it's trivial to crash it with a query like this:
>
> SELECT 1 FROM pg_proc p JOIN pg_operator o
> ON oprcode = (p.oid::int4 + 1);
>
>
Thanks for looking at this Tomas. Sorry it's taken this long for me to
respond, but I wanted to do so with a working patch.

I've made a few changes in the attached version:

1. Fixed Assert failure when eclass contained non-Var types, as reported by
you.
2. Added support for expression indexes.

The expression indexes should really be supported as with the previous
patch they worked ok with LEFT JOINs, but not INNER JOINs, that
inconsistency is pretty much a bug in my book, so I've fixed it.

The one weird quirk with the patch is that, if we had some tables like:

create table r1 (id int primary key, value int not null);
create table r2 (id int primary key);

And a query:
explain verbose select * from r1 inner join r2 on r1.id=r2.id where r2.id
=r1.value;

The join is not properly detected as a unique join. This is down to the
eclass containing 3 members, when the code finds the 2nd ec member for r1
it returns false as it already found another one. I'm not quite sure what
the fix is for this just yet as it's not quite clear to me how the code
would work if there were 2 vars from each relation in the same eclass... If
these Vars were of different types then which operators would we use for
them? I'm not sure if having eclassjoin_is_unique_join() append every
possible combination to index_exprs is the answer. I'm also not quite sure
if the complexity is worth the extra code either.

Updated patch attached.

Thank you for looking it and reporting that bug.

Regards

David Rowley

Attachment Content-Type Size
unijoin_2015-03-04_ac455bd.patch application/octet-stream 57.5 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2015-03-04 09:04:03 Re: Performance improvement for joins where outer side is unique
Previous Message Shigeru Hanada 2015-03-04 08:57:21 Re: Join push-down support for foreign tables