Re: Wrong query results caused by loss of join quals

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: Wrong query results caused by loss of join quals
Date: 2023-02-19 20:56:22
Message-ID: 3328238.1676840182@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Richard Guo <guofenglinux(at)gmail(dot)com> writes:
> ... As we can see, the join qual 't2.a = t2.b' disappears in the plan, and
> that results in the wrong query results.

Ugh.

> I did some dig and here is what happened. Firstly both sides of qual
> 't2.a = t2.b' could be nulled by the OJ t1/t2 and they are marked so in
> their varnullingrels. Then we decide that this qual can form a EC, and
> the EC's ec_relids is marked as {t2, t1/t2}. Note that t1 is not
> included in this ec_relids. So when it comes to building joinrel for
> t1/t2, generate_join_implied_equalities fails to generate the join qual
> from that EC.

Hmm. My intention for this sort of case was that the nulled Vars should
look like "new_members" to generate_join_implied_equalities_normal,
since they are computable at the join node (in filter not join quals)
but not computable within either input. Then it would generate the
necessary quals to equate them to each other. The reason that that
doesn't happen is that get_common_eclass_indexes believes it can ignore
ECs that don't mention t1. The attached quick hack is enough to fix
the presented case, but:

* I suspect the other use of get_common_eclass_indexes, in
have_relevant_eclass_joinclause, is broken as well.

* This fix throws away a fair bit of the optimization intended by
3373c7155, since it will result in examining some irrelevant ECs.
I'm not sure if it's worth complicating get_common_eclass_indexes
to try to recover that by adding knowledge about outer joins.

* I'm now kind of wondering whether there are pre-existing bugs of the
same ilk. Maybe not, because before 2489d76c4 an EC constraint that was
computable at the join but not earlier would have to have mentioned both
sides of the join ... but I'm not quite sure.

BTW, while looking at this I saw that generate_join_implied_equalities'
calculation of nominal_join_relids is wrong for child rels, because it
fails to fold the join relid into that if appropriate. In cases similar
to this one, that could result in generate_join_implied_equalities_broken
doing the same sort of wrong thing, that is rejecting quals it should
have enforced at the join. I don't think the use of nominal_join_relids
added by this patch is affected, though: a Var mentioning the outer join
in varnullingrels would have to have some member of the RHS' base rels
in varno, and I think a parallel statement can be made about PHVs.

regards, tom lane

Attachment Content-Type Size
quick-hack-for-missing-join-quals.patch text/x-diff 697 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joel Jacobson 2023-02-19 22:16:43 Re: Missing free_var() at end of accum_sum_final()?
Previous Message Andres Freund 2023-02-19 20:46:10 Re: Handle TEMP_CONFIG for pg_regress style tests in pg_regress.c