Re: Do not scan index in right table if condition for left join evaluates to false using columns in left table

From: Andrei Lepikhov <lepihov(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>
Cc: Илья Жарков <izharkov1243(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, eng eng <pspetrov91(at)gmail(dot)com>
Subject: Re: Do not scan index in right table if condition for left join evaluates to false using columns in left table
Date: 2026-06-30 08:20:52
Message-ID: e419bafa-cfed-4324-89ad-9b2978fff512@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 27/05/2026 12:11, Andrei Lepikhov wrote:
> Here is an updated patch that outlines the solution. It still needs to fix some
> issues, benchmarking, and write more tests, but the results so far look
> promising.

During production-like tests, a clear regression was found. Look:

EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF, BUFFERS OFF, SUMMARY OFF)
SELECT count(*) FROM onek t1 LEFT JOIN int4_tbl t2
ON (t1.unique1 = t2.f1 AND t1.hundred IN (1, 2));

Aggregate (actual rows=1.00 loops=1)
-> Nested Loop Left Join (actual rows=1000.00 loops=1)
Join Filter: (t1.unique1 = t2.f1)
Rows Removed by Join Filter: 1000
-> Seq Scan on onek t1 (actual rows=1000.00 loops=1)
-> Result (actual rows=1.00 loops=1000)
One-Time Filter: (t1.hundred = ANY ('{1,2}'::integer[]))
-> Materialize (actual rows=5.00 loops=200)
-> Seq Scan on int4_tbl t2 (actual rows=5.00 loops=200)

Table int4_tbl in this example is rescanned on each rescan of the Materialize
node. The key here is a REWIND flag that NestLoop should set for the inner side
to signal that the subtree is prepared to be rescanned. Without any parameters
inside, materialize node should prepare a tuplestore and skip rescanning of the
underlying subtree. Gating clause adds a parameter to the NestLoop, even if it
doesn't go any deep to the inner subtree.
In this example, 'gating' Result node contains a parameter 't1.hundred' that
signals NestLoop that the inner side may need rebuilding on rescan. So, a gated
NestLoop is a little different animal that lies between parameterised and
non-parameterised NL and deserves to be separated from both somehow.

The solution is quite straightforward: add to the NestLoop struct a flag
'keep_inner_rewind' that is defined as:

keep_inner_rewind =
!bms_overlap(PATH_REQ_OUTER(best_path->jpath.innerjoinpath), outerrelids);

With such a flag, ExecInitNestLoop() can decide whether the inner side itself is
parameterised and set the REWIND flag correctly.

Side story here is a system-dependent output in EXPLAIN. This long-lasting (for
extension tests) problem can be solved with a single function that accumulates
masking tricks across all system-dependent output. Right now, regression tests
have the following:

explain_analyze, explain_analyze_without_memory, explain_mask_costs,
explain_memoize, explain_merge, explain_parallel_append,
explain_parallel_sort_stats, ...

Maybe I missed something.

It is obvious that memory, costs and stats should be filtered out in regression
tests for the sake of stability. So, there is no reason to keep multiple
functions - they may be unified and accessed from everywhere.

See the new version of the patch attached. I treat this version still as a 'draft'.

--
regards, Andrei Lepikhov,
pgEdge

Attachment Content-Type Size
v0-0001-Gate-the-NestLoop-inner-side-with-outer-only-join.patch text/plain 52.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Previous Message Tomas Vondra 2026-06-30 08:10:13 Re: occasional ECPG failures on dikkop (FreeBSD)