Re: BUG #17978: Unexpected error: "wrong varnullingrels (b) (expected (b 5)) for Var 6/2" triggered by JOIN

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: zuming(dot)jiang(at)inf(dot)ethz(dot)ch, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17978: Unexpected error: "wrong varnullingrels (b) (expected (b 5)) for Var 6/2" triggered by JOIN
Date: 2023-06-19 16:07:04
Message-ID: 2979923.1687190824@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Richard Guo <guofenglinux(at)gmail(dot)com> writes:
> FWIW, I found that we have an existing test query in sql/join.sql that
> almost exposes this issue.

> explain (costs off)
> select * from int4_tbl t1
> left join ((select t2.f1 from int4_tbl t2
> left join int4_tbl t3 on t2.f1 > 0
> where t3.f1 is null) s
> left join tenk1 t4 on s.f1 > 1)
> on s.f1 = t1.f1;

> If we change the WHERE clause to 't2.f1 != coalesce(t3.f1, 1)', we will
> see this issue.

Interesting that you should bring up that query, because after
quick-hacking a fix in remove_useless_results(), all the existing
test cases pass except that that one changes plan: it now does the
joins in syntactic order, whereas HEAD finds a plan that joins t4
last. Joining t4 last has a noticeably better cost estimate than
the syntactic-order plan, so it's kind of sad that we can't find
it anymore. On the other hand, v15 and before don't find it either,
so this wouldn't be a regression.

According to inspection of the SpecialJoinInfos, HEAD believes
that the t2/t3 outer join of that query could commute according
to identity 3 with the outermost left join between t1 and t2.
It doesn't act on that observation, and I'm not sure it's even
correct to believe that. This patch doesn't believe that anymore,
but it still believes that the t1/t2 join could commute with the
t2/t4 join. Why it no longer acts on that is unclear. There is
probably some buglet or missed opportunity somewhere.

Even odder is that given your modified query in which the WHERE
clause mentions both t2 and t3, it *does* find the plan that
joins t4 last, as shown in the new test case. This makes me
wonder if it's something triggered by the avoid-clauseless-joins
heuristic, though I can't quite see how that'd apply.

Anyway, what I'm inclined to do is flesh out the attached by updating
the comments for remove_useless_results() and then push it. Later
on we can look for why it's not finding the better join order; that's
a separable issue, and if it is about avoid-clauseless-joins then we
might choose to live with it rather than incur a lot of planner cost
to fix it.

regards, tom lane

Attachment Content-Type Size
draft-patch-for-bug-17978.patch text/x-diff 3.5 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2023-06-19 16:08:47 Re: BUG #17982: Inconsistent results of SELECT with CTE caused by subquery comparison
Previous Message Laurenz Albe 2023-06-19 13:34:24 Re: BUG #17981: HY000 server closed the connection unexpectedly