Re: BUG #18187: Unexpected error: "variable not found in subplan target lists" triggered by JOIN

From: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Richard Guo <guofenglinux(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, zuming(dot)jiang(at)inf(dot)ethz(dot)ch, pgsql-bugs(at)lists(dot)postgresql(dot)org, PG Bug reporting form <noreply(at)postgresql(dot)org>
Subject: Re: BUG #18187: Unexpected error: "variable not found in subplan target lists" triggered by JOIN
Date: 2023-11-30 07:21:50
Message-ID: 85d91f13-b6d3-4d5c-8866-f2d9533c042a@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On 28/11/2023 14:03, Richard Guo wrote:
>
> On Tue, Nov 28, 2023 at 1:42 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us
> <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>> wrote:
>
> Richard Guo <guofenglinux(at)gmail(dot)com <mailto:guofenglinux(at)gmail(dot)com>>
> writes:
> > After some more thought, I think PHVs should not impose any
> constraints
> > on removing self joins.  If the removed rel is contained in the
> relids
> > that a PHV is evaluated/needed at or laterally references, it can
> just
> > be replaced with the rel that is kept.
>
> I experimented with trying to break this patch using this test case:
>
> create table t (id int primary key, f1 text, f2 text);
> create table i (f0 int);
>
> explain verbose
> select * from i left join (
>   (select *, 1 as x from t tss1) t1 join
>   (select *, 2 as y from t tss2) t2
>   on t1.id <http://t1.id> = t2.id <http://t2.id>
> ) on i.f0 = t1.id <http://t1.id>;
>
>
> Thanks for the query!
>
> It seems to successfully remove the self-join between tss1 and tss2,
> but there is something wrong.  If you examine the PlannerInfo just
> after remove_useless_self_joins, you will notice that the two
> PlaceHolderVars (for x and y) in the placeholder_list both have
>
>          :phrels (b 7)
>          :phnullingrels (b)
>
> where beforehand x had
>
>          :phrels (b 6)
>          :phnullingrels (b 5)
>
> and y had
>
>          :phrels (b 7)
>          :phnullingrels (b 5)
>
> It's correct to move both phrels locations to rel 7 (the surviving
> self-joined rel) but what became of the reference to nullingrel 5?
> That seems clearly wrong, since we have not removed the left join.
>
>
> What you noticed before remove_useless_self_joins seems not right to me.
> The PHVs in placeholder_list are supposed to have empty phnullingrels by
> convention, as explained in find_placeholder_info.
>
> * By convention, phinfo->ph_var->phnullingrels is always empty, since the
> * PlaceHolderInfo represents the initially-calculated state of the
> * PlaceHolderVar.
>
> Much worse, if you look around elsewhere in the data structure,
> particularly at the processed_tlist, you find instances of the
> PlaceHolderVars that have not been updated and still have the
> old values such as ":phrels (b 6)".
>
>
> Good catch!  We fail to handle PlaceHolderVar case in sje_walker and
> replace_varno_walker, which is wrong.  Fixed in v3 patch.
>
> It's not apparent to me why the cross-checks we have in setrefs.c
> and elsewhere don't detect a problem with this.  But it seems
> clear that remove_useless_self_joins is still several bricks
> shy of a load in terms of fully updating the data structure for a
> join removal.  Probably with some more work to add complication
> to this test case, we could demonstrate an observable failure.
>
>
> Fair point.  I have the same concern that we still have other loose ends
> in SJE updating necessary data structures in self join removal.  And I
> noticed that Andres expressed the same concern in [1].  I think we
> should come up with some solution to detect/avoid such problems, but I
> imagine that that should be a seperate patch.

Agree. PlannerInfo is a hot structure where many new features try to
save some data. The history of this patch has shown, for example,
JoinDomains. Finding field jd_relids, which could contain the link to
the removing relation, was tricky.
I still have no idea how to check it internally except by attaching a
comment to the PlannerInfo structure.

--
regards,
Andrei Lepikhov
Postgres Professional

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2023-11-30 10:54:36 BUG #18218: NOT LIKE ANY returns same result as LIKE ANY when array items are wrapped into E''
Previous Message Sandeep Thakkar 2023-11-30 07:15:23 Re: BUG #18201: POSTGRESQL 16 - Failed to initialise the database cluster with initdb