Re: Wrong query results caused by loss of join quals

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: Wrong query results caused by loss of join quals
Date: 2023-02-22 20:50:41
Message-ID: 283948.1677099041@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Richard Guo <guofenglinux(at)gmail(dot)com> writes:
> On Mon, Feb 20, 2023 at 4:56 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> * I suspect the other use of get_common_eclass_indexes, in
>> have_relevant_eclass_joinclause, is broken as well.

> It seems have_relevant_joinclause is broken for the presented case. It
> does not get a change to call have_relevant_eclass_joinclause, because
> flag 'has_eclass_joins' is not set for t1 due to t1 being not in the
> EC's ec_relids. As a result, have_relevant_joinclause thinks there is
> no joinclause that involves t1 and t2, which is not right.

I thought about this and decided that it's not really a problem.
have_relevant_joinclause is just a heuristic, and I don't think we
need to prioritize forming a join if the only relevant clauses look
like this. We won't be able to use such clauses for merge or hash,
so we're going to end up with an unconstrained nestloop, which isn't
something to be eager to form. The join ordering rules will take
care of forcing us to make the join when necessary.

>> * This fix throws away a fair bit of the optimization intended by
>> 3373c7155, since it will result in examining some irrelevant ECs.

> Yeah, this is also my concern that we'd lose some optimization about
> finding ECs.

The only easy improvement I can see to make here is to apply the old
rules at inner joins. Maybe it's worth complicating the data structures
to be smarter at outer joins, but I rather doubt it: we could easily
expend more overhead than we'll save here by examining irrelevant ECs.
In any case, if there is a useful optimization here, it can be pursued
later.

>> BTW, while looking at this I saw that generate_join_implied_equalities'
>> calculation of nominal_join_relids is wrong for child rels, because it
>> fails to fold the join relid into that if appropriate.

> I dug a little into this and it seems this is all right as-is.

I changed it anyway after noting that (a) passing in the ojrelid is
needful to be able to distinguish inner and outer joins, and
(b) the existing comment about the join_relids input is now wrong.
Even if it happens to not be borked for current callers, that seems
like a mighty fragile assumption.

Less-hasty v2 patch attached.

regards, tom lane

Attachment Content-Type Size
fix-for-missing-join-quals-2.patch text/x-diff 12.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-02-22 20:55:44 Re: pg_regress: Treat child process failure as test failure
Previous Message Daniel Gustafsson 2023-02-22 20:42:21 Re: pg_regress: Treat child process failure as test failure