Quick Links

Re: Planner join order regression from PG 15 to PG 16+: 70ms -> 1440ms (self-contained reproducer included)

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Andrei Lepikhov <lepihov(at)gmail(dot)com>
Cc:	Mauro Gatti <mauro(dot)gt70(at)gmail(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject:	Re: Planner join order regression from PG 15 to PG 16+: 70ms -> 1440ms (self-contained reproducer included)
Date:	2026-03-05 21:44:38
Message-ID:	14531.1772747078@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

Andrei Lepikhov <lepihov(at)gmail(dot)com> writes:
> On 5/3/26 17:25, Mauro Gatti wrote:
>> ## Questions for the community

> Thanks for stable reproduction!

Yes, we appreciate that much effort being put into trouble reports.
Makes it a lot easier to see what's going wrong.

> Your case is typical for 'never executed' nodes. As you can see, the
> costs of your query plans are very close, and the estimation error is
> large due to multiple clauses in your filter. As I see, for the planner,
> there is no difference in which version of the plan to choose - it is
> just a game of chance.

Right. Given the very-far-off rowcount estimates for some of the
index scans, it'd be surprising if the planner arrived at a good
join order. It's a "garbage in, garbage out" situation. As Andrei
suggested, you can often improve bad rowcount estimates by creating
custom statistics.

I found it was sufficient to do

=# create statistics on brand_id,line_code,model_year,model_code,version_code,pricelist_id from pricelist_options;
CREATE STATISTICS
=# analyze pricelist_options;
ANALYZE

That doesn't result in fully accurate estimates:

=# explain analyze select * from pricelist_options pl where ((pl.brand_id = 10) AND ((pl.line_code)::text = 'ABC'::text) AND ((pl.model_year)::text = 'YR'::text) AND ((pl.model_code)::text = 'ABC'::text) AND ((pl.version_code)::text = 'VER-001'::text) AND (pl.pricelist_id = 100));
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using ix_pricelist_options_1 on pricelist_options pl (cost=0.43..117.12 rows=28 width=106) (actual time=0.015..0.037 rows=165.00 loops=1)
Index Cond: ((brand_id = 10) AND ((line_code)::text = 'ABC'::text) AND ((model_year)::text = 'YR'::text) AND ((model_code)::text = 'ABC'::text) AND ((version_code)::text = 'VER-001'::text) AND (pricelist_id = 100))

but "28 rows" is a lot closer to 165 than "1 row", and it's enough
to push the planner to choose the plan you want.

I do concur with Andrei's recommendation to create stats matching
your other multicolumn indexes, though.

regards, tom lane

In response to

Re: Planner join order regression from PG 15 to PG 16+: 70ms -> 1440ms (self-contained reproducer included) at 2026-03-05 19:09:06 from Andrei Lepikhov

Responses

Re: Planner join order regression from PG 15 to PG 16+: 70ms -> 1440ms (self-contained reproducer included) at 2026-03-06 08:24:03 from Mauro Gatti

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Mauro Gatti	2026-03-06 08:24:03	Re: Planner join order regression from PG 15 to PG 16+: 70ms -> 1440ms (self-contained reproducer included)
Previous Message	Andrei Lepikhov	2026-03-05 19:09:06	Re: Planner join order regression from PG 15 to PG 16+: 70ms -> 1440ms (self-contained reproducer included)