Re: Performance improvement for joins where outer side is unique

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance improvement for joins where outer side is unique
Date: 2017-04-06 23:47:50
Message-ID: 18580.1491522470@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> writes:
> On 7 April 2017 at 07:26, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I'm looking through this, and I'm failing to see where it deals with
>> the problem we discussed last time, namely that you can't apply the
>> optimization unless all clauses that were used in the uniqueness
>> proof are included in the join's merge/hash conditions + joinquals.

> The code in question is:
> mergestate->mj_SkipMarkRestore = !mergestate->js.joinqual &&
> mergestate->js.first_inner_tuple_only;

Uh, AFAICS that only protects the skip-mark-and-restore logic.
What I'm on about is that you can't do the early advance to the
next outer tuple unless you're sure that all the quals that were
relevant to the uniqueness proof have been checked for the current
inner tuple. That affects all three join types not only merge.

The case that would be relevant to this is, eg,

create table t1 (f1 int, f2 int, primary key(f1,f2));

select * from t_outer left join t1 on (t_outer.f1 = t1.f1)
where t_outer.f2 = t2.f2;

Your existing patch would think t1 is unique-inner, but the qual pushed
down from WHERE would not be a joinqual so the wrong thing would happen
at runtime.

(Hm ... actually, this example wouldn't fail as written because
the WHERE qual is probably strict, so the left join would get
reduced to an inner join and then pushed-down-ness no longer
matters. But hopefully you get my drift.)

> I don't really think the List idea would be nearly as efficient.

No, what I'm saying is that each RelOptInfo would contain a single List of
Relids of proven-unique-for outer rels (and another one for the negative
cache). No array, no more searching than you have now, just removal of an
uncertainly-safe array fetch.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2017-04-06 23:53:31 Re: Making clausesel.c Smarter
Previous Message Beena Emerson 2017-04-06 23:38:15 Re: increasing the default WAL segment size