Re: num_sa_scans in genericcostestimate

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: num_sa_scans in genericcostestimate
Date: 2022-09-08 19:17:18
Message-ID: 4108679.1662664638@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
> When costing a btree index scan, num_sa_scans gets computed twice, once in
> btcostestmeate and once in genericcostestimate. But the computations are
> different. It looks like the generic one includes all =ANY in any column
> in the index, while the bt one includes only =ANY which or on columns for
> which all the preceding index columns are tested for equality.

I think this is correct. As per the comments in btcostestimate:

* For a btree scan, only leading '=' quals plus inequality quals for the
* immediately next attribute contribute to index selectivity (these are
* the "boundary quals" that determine the starting and stopping points of
* the index scan). Additional quals can suppress visits to the heap, so
* it's OK to count them in indexSelectivity, but they should not count
* for estimating numIndexTuples. So we must examine the given indexquals
* to find out which ones count as boundary quals. ...

and further down

/* count number of SA scans induced by indexBoundQuals only */
if (alength > 1)
num_sa_scans *= alength;

This num_sa_scans value computed by btcostestimate is (or should be)
only used in calculations related to numIndexTuples, whereas the one
in genericcostestimate should be used for calculations related to the
overall number of heap tuples returned by the indexscan. Maybe there
is someplace that is using the wrong one, but it's not a bug that they
are different.

> The context for this is that I was looking at cases where btree indexes
> were not using all the columns they could, but rather shoving some of the
> conditions down into a Filter unnecessarily/unhelpfully. This change
> doesn't fix that, but it does seem to be moving in the right direction.

If it helps, it's only accidental, because this patch is surely wrong.
We *should* be distinguishing these numbers.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-09-08 19:33:40 Re: num_sa_scans in genericcostestimate
Previous Message Jacob Champion 2022-09-08 18:18:42 Re: [PoC] Let libpq reject unexpected authentication requests