An inefficient query caused by unnecessary PlaceHolderVar

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: An inefficient query caused by unnecessary PlaceHolderVar
Date: 2023-05-12 06:35:33
Message-ID: CAMbWs48uk6C7Z9m_FNT8_21CMCk68hrgAsz=z6zpP1PNZMkeoQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I happened to notice that the query below can be inefficient.

# explain (costs off)
select * from
int8_tbl a left join
(int8_tbl b inner join
lateral (select *, b.q2 as x from int8_tbl c) ss on b.q2 = ss.q1)
on a.q1 = b.q1;
QUERY PLAN
------------------------------------
Hash Right Join
Hash Cond: (b.q1 = a.q1)
-> Nested Loop
-> Seq Scan on int8_tbl b
-> Seq Scan on int8_tbl c
Filter: (b.q2 = q1)
-> Hash
-> Seq Scan on int8_tbl a
(8 rows)

For B/C join, currently we only have one option, i.e., nestloop with
parameterized inner path. This could be extremely inefficient in some
cases, such as when C does not have any indexes, or when B is very
large. I believe the B/C join can actually be performed with hashjoin
or mergejoin here, as it is an inner join.

This happens because when we pull up the lateral subquery, we notice
that Var 'x' is a lateral reference to 'b.q2' which is outside the
subquery. So we wrap it in a PlaceHolderVar. This is necessary for
correctness if it is a lateral reference from nullable side to
non-nullable item. But here in this case, the referenced item is also
nullable, so actually we can omit the PlaceHolderVar with no harm. The
comment in pullup_replace_vars_callback() also explains this point.

* (Even then, we could omit the PlaceHolderVar if
* the referenced rel is under the same lowest outer join, but
* it doesn't seem worth the trouble to check that.)

All such PHVs would imply lateral dependencies which would make us have
no choice but nestloop. I think we should avoid such PHVs as much as
possible. So IMO it may 'worth the trouble to check that'.

Attached is a patch to check that for simple Vars. Maybe we can extend
it to avoid PHVs for more complex expressions, but that requires some
codes because for now we always wrap non-var expressions to PHVs in
order to have a place to insert nulling bitmap. As a first step, let's
do it for simple Vars only.

Any thoughts?

Thanks
Richard

Attachment Content-Type Size
v1-0001-Avoid-unnecessary-PlaceHolderVars-for-simple-Vars.patch application/octet-stream 14.2 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2023-05-12 06:40:19 Re: psql tests hangs
Previous Message shveta malik 2023-05-12 06:33:38 Re: Support logical replication of DDLs