Re: Indexes on expressions with multiple columns and operators

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>
Cc: Frédéric Yhuel <frederic(dot)yhuel(at)dalibo(dot)com>, "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>, Christophe Courtois <christophe(dot)courtois(at)dalibo(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Subject: Re: Indexes on expressions with multiple columns and operators
Date: 2025-09-18 16:59:11
Message-ID: 1975239.1758214751@sss.pgh.pa.us
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com> writes:
> On a fresh instance from HEAD with its default configuration, it shows:

> Index Scan using foo_s_idx on foo (cost=0.29..8.39 rows=33333 width=13)
> Index Cond: (s(crit, ackid) = true)

> It seems statistics shown in "pg_stats" view for function "s()" are good. The
> query itself even have the same costs than the query using the syntax tips you
> provide before.

> However, the estimated row number seems wrong in regard with the costs shown
> and statistics.

Yeah. The problem is that clause_selectivity_ext fails to consider
use of statistics if the clause looks like "bool_valued_function(...)".
If it looks like "bool_valued_function(...) = true", that goes down
a different code path that does the right thing.

Additional factors:

* If you just write "WHERE bool_valued_function(...) = true", that
gets stripped down to "WHERE bool_valued_function(...)" in the name
of making equivalent expressions look equivalent. (IS TRUE doesn't
get stripped, which is why you have to use that wording to avoid
that.)

* Index condition building puts back the "= true" in order to
construct something that satisfies the index AM API. And then it
uses that form to get a selectivity estimate for costing purposes
--- so the right number goes into the indexscan cost estimate.

* But the rowcount estimate is made on the form without "= true".
That's the number shown in EXPLAIN and used when considering
joins.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message David Rowley 2025-09-19 01:05:11 Re: Why isn't PG using an index-only scan?
Previous Message Ranier Vilela 2025-09-18 16:48:08 Re: Indexes on expressions with multiple columns and operators