Re: Postgres picks suboptimal index after building of an extended statistics

From: Andres Freund <andres(at)anarazel(dot)de>
To: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Postgres picks suboptimal index after building of an extended statistics
Date: 2022-11-22 17:14:50
Message-ID: 20221122171450.adypdazxs3gkvbzv@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2022-07-11 12:57:36 +0500, Andrey Lepikhov wrote:
> On 7/8/22 03:07, Tom Lane wrote:
> > Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru> writes:
> > > On 12/8/21 04:26, Tomas Vondra wrote:
> > > > I wonder if we should teach clauselist_selectivity about UNIQUE indexes,
> > > > and improve the cardinality estimates directly, not just costing for
> > > > index scans.
> >
> > > I tried to implement this in different ways. But it causes additional
> > > overhead and code complexity - analyzing a list of indexes and match
> > > clauses of each index with input clauses in each selectivity estimation.
> > > I don't like that way and propose a new patch in attachment.
> >
> > I looked at this briefly. I do not think that messing with
> > btcostestimate/genericcostestimate is the right response at all.
> > The problem can be demonstrated with no index whatever, as in the
> > attached shortened version of the original example. I get
>
> I partly agree with you. Yes, I see the problem too. But also we have a
> problem that I described above: optimizer don't choose a path with minimal
> selectivity from a set selectivities which shows cardinality less than 1
> (see badestimate2.sql).
> New patch (see in attachment), fixes this problem.

This causes the mains regression tests to fail due to a planner change:

https://cirrus-ci.com/build/6680222884429824

diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/join.out /tmp/cirrus-ci-build/build/testrun/regress/regress/results/join.out
--- /tmp/cirrus-ci-build/src/test/regress/expected/join.out 2022-11-22 12:27:18.852087140 +0000
+++ /tmp/cirrus-ci-build/build/testrun/regress/regress/results/join.out 2022-11-22 12:28:47.934938882 +0000
@@ -6671,10 +6671,9 @@
Merge Cond: (j1.id1 = j2.id1)
Join Filter: (j2.id2 = j1.id2)
-> Index Scan using j1_id1_idx on j1
- -> Index Only Scan using j2_pkey on j2
+ -> Index Scan using j2_id1_idx on j2
Index Cond: (id1 >= ANY ('{1,5}'::integer[]))
- Filter: ((id1 % 1000) = 1)
-(7 rows)
+(6 rows)

select * from j1
inner join j2 on j1.id1 = j2.id1 and j1.id2 = j2.id2

Greetings,

Andres Freund

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2022-11-22 17:16:59 Re: Allow single table VACUUM in transaction block
Previous Message Andres Freund 2022-11-22 17:10:00 Re: [PoC] Improve dead tuple storage for lazy vacuum