Re: join plan with unexpected var clauses

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Luc Vlaming <luc(at)swarm64(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: join plan with unexpected var clauses
Date: 2021-02-03 16:25:38
Message-ID: 3921516.1612369538@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Luc Vlaming <luc(at)swarm64(dot)com> writes:
> Given the testcase we see that the outer semi join tries to join the
> outer with the inner table id columns, even though the middle table id
> column is also there. Is this expected behavior?

I don't see anything greatly wrong with it. The planner has concluded
that _inner.id2 and middle.id1 are part of an equivalence class, so it
can form the top-level join by equating _outer.id3 to either of them.
AFAIR that choice is made at random --- there's certainly not any logic
that thinks about "well, the intermediate join output could be a bit
narrower if we choose this one instead of that one".

I think "made at random" actually boils down to "take the first usable
member of the equivalence class". If I switch around the wording of
the first equality condition:

... select 1 from _inner where middle.id1 = _inner.id2

then I get a plan where the top join uses middle.id1. However,
it's still propagating both middle.id1 and _inner.id2 up through
the bottom join, so that isn't buying anything efficiency-wise.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-02-03 16:29:37 Re: Removing support for COPY FROM STDIN in protocol version 2
Previous Message Daniel Gustafsson 2021-02-03 16:15:16 Re: Online checksums patch - once again