Re: Subquery pull-up increases jointree search space

From: Andrei Lepikhov <lepihov(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Richard Guo <guofenglinux(at)gmail(dot)com>
Subject: Re: Subquery pull-up increases jointree search space
Date: 2026-05-13 15:11:18
Message-ID: 54aa2de9-dd13-4147-a847-6c095861d34b@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 09/05/2026 12:51, Andrei Lepikhov wrote:
> On 09/02/2026 23:34, Andrei Lepikhov wrote:
>> On 9/2/26 21:16, Tom Lane wrote:
>>> What I'm wondering about is that join_collapse_limit and
>>> from_collapse_limit were invented more than two decades ago, but
>>> we've not touched their default values since then.  Machines are a
>>> lot faster since 2004, and we've probably achieved some net speedups
>>> in the planner logic as well.  Could we alleviate this concern by
>>> raising those defaults, and if so, what are reasonable values in 2026?
>>
>> As I see, people never use the default settings now. The case that triggered
>> this topic could work well with a join collapse limit around 40 joins (GEQO
>> started at 14). But a specific setting always depends on how much time people
>> want to spend on planning. So, I don't think a change of default settings is
>> needed.
> After looking into more cases, I realized the main issue is actually something else.
> Looking at pull_up_sublinks_qual_recurse, I see that we have information to skip
> extra work if the join_collapse_limit is high enough.
>

And here is the patch.
The main idea is that after a transformation (EXISTS or ANY), we should not just
initialise the larg of the JoinExpr with the incoming join tree. Instead, we
need to look for minimal subtree. Since we already have relids referenced in the
subselect, we can traverse this subtree to find the smallest safe join tree that
contains these relids. Then, we initialise the larg of our SEMI JOIN JoinExpr
with this subtree and update the larg of the upper JoinExpr to point to our
SemiJoin.

Such behaviour is quite close to the not-pulled-up variant when a subselect is
used as a filter and pushed down to the minimum level needed to evaluate this
SubLink. So, delay the influence of join_collapse_limit as much as possible.

The patch provided contains implementation of the idea - the key function is
insert_pulled_up_sublink_join. It is a little polished with Claude AI - it added
tests to detect any issues during rebase onto master and documentation.

I'll CC Richard, since he often challenges the transformation machinery and may
be interested in helping to stabilise the user experience after the upgrade.

--
regards, Andrei Lepikhov,
pgEdge

Attachment Content-Type Size
v0-0001-Push-pulled-up-SEMI-ANTI-joins-next-to-their-refe.patch text/plain 38.6 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2026-05-13 15:15:31 Re: Is there value in having optimizer stats for joins/foreignkeys?
Previous Message Zsolt Parragi 2026-05-13 14:57:29 Re: [PATCH] Rebuild CHECK constraints after generated column SET EXPRESSION