Re: Bitmap scan is undercosted? - overestimated correlation and cost_index

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Vitaliy Garnashevich <vgarnashevich(at)gmail(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Bitmap scan is undercosted? - overestimated correlation and cost_index
Date: 2017-12-06 21:46:52
Message-ID: 20171206214652.GA13889@telsasoft.com
Lists: pgsql-hackers pgsql-performance

On Tue, Dec 05, 2017 at 01:50:11PM -0500, Tom Lane wrote:
> Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
> > On Dec 3, 2017 15:31, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
> >>> But I do see that ties within the logical order of the column values are
> >>> broken to agree with the physical order. That is wrong, right? Is there
> >>> any argument that this is desirable?
>
> >> Uh ... what do you propose doing instead? We'd have to do something with
> >> ties, and it's not so obvious this way is wrong.
>
> > Let them be tied.
...
> I thought some more about this. What we really want the correlation stat
> to do is help us estimate how randomly an index-ordered scan will access
> the heap. If the values we've sampled are all unequal then there's no
> particular issue. However, if we have some group of equal values, we
> do not really know what order an indexscan will visit them in. The
> existing correlation calculation is making the *most optimistic possible*
> assumption, that such a group will be visited exactly in heap order ---
> and that assumption isn't too defensible.

I'm interested in the discussion regarding bitmap cost, since it would have helped
our case discussed here ~18 months ago:
https://www.postgresql.org/message-id/flat/20160524173914(dot)GA11880%40telsasoft(dot)com#20160524173914(dot)GA11880(at)telsasoft(dot)com

...but remember: in Vitaliy's case (as opposed to mine), the index scan is
*faster* but is estimated at a higher cost than the bitmap scan (I have to keep
reminding myself). So the rest of this discussion is about the
overly-optimistic cost estimate of index scans, which moves in the opposite
direction from this reported problem. For the test cases I looked at, index
scans were used when random_page_cost=1 and redundant conditions were avoided,
so I'm not sure whether there's any remaining issue (but I haven't looked at the
latest cases Vitaliy sent).

> In any case, given that we do this calculation without regard
> to any specific index,

One solution is to compute stats (at least correlation) for all indexes, not
just expression indexes. I did that earlier this year while throwing
around/out ideas.
https://www.postgresql.org/message-id/20170707234119.GN17566%40telsasoft.com

> We do have an idea, from the data we have, whether the duplicates are close
> together in the heap or spread all over.

I think you just mean pg_stats.correlation for all values, not just duplicates
(with the understanding that duplicates might be a large fraction of the
tuples, and so carry a high weight in the correlation).

Another issue I noted in an earlier thread: as table sizes increase, the
existing correlation computation approaches 1 for correlated insertions (like
"append-only" timestamps clustered around now()). Because ANALYZE samples only
a fraction of the table, the statistic captures only the large-scale
correlation and, to an increasing degree, fails to represent the small-scale
variation between adjacent index TIDs. That variation has a real cost (and the
mitigation from cache effects probably also decreases as the table grows). I
think any solution needs to handle this somehow.

Generated data demonstrating this (I reused this query so it's more complicated
than it needs to be):

[pryzbyj(at)database ~]$ time for sz in 9999{,9{,9{,9{,9}}}} ; do
    psql postgres -tc "
        DROP TABLE IF EXISTS t;
        CREATE TABLE t(i float, j int);
        CREATE INDEX ON t(i);
        INSERT INTO t SELECT i/99999.0+pow(2,(-random())) FROM generate_series(1,$sz) i ORDER BY i;
        ANALYZE t;
        SELECT $sz, correlation, most_common_freqs[1] FROM pg_stats WHERE attname='i' AND tablename='t'";
  done

9999 | 0.187146 |
99999 | 0.900629 |
999999 | 0.998772 |
9999999 | 0.999987 |

Trying to keep it all in my own head: for a sufficiently large number of pages,
a bitmap scan should be preferred to an index scan because the reduced
random-page cost outweighs the bitmap's overhead in CPU cost. That should
probably be done by penalizing index scans, not by discounting bitmap scans.
Conceivably a correlation adjustment could be conditionalized or weighted based
on index_pages_fetched(), something like:

x = ln(x / 999999);
if (x > 1)
    correlation /= x;
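
A rough, purely illustrative sketch of that damping (a hypothetical helper, not
PostgreSQL source; the 999999 divisor and the ln() shape are the same arbitrary
placeholders as above, and pages_fetched is assumed to come from something like
index_pages_fetched()):

#include <math.h>
#include <stdio.h>

/*
 * Dampen correlation once the estimated number of heap pages fetched gets
 * large, so huge "append-mostly" tables stop looking perfectly ordered to
 * cost_index().
 */
static double
damped_correlation(double correlation, double pages_fetched)
{
	double		x = log(pages_fetched / 999999.0);

	if (x > 1.0)
		correlation /= x;

	return correlation;
}

int
main(void)
{
	printf("%f\n", damped_correlation(0.99, 1e5));	/* small scan: unchanged */
	printf("%f\n", damped_correlation(0.99, 1e8));	/* large scan: dampened */
	return 0;
}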

I think one could look at the fraction of duplicated index keys expected to be
returned: if we expect to return 1000 tuples, with 200 of them duplicates from
the MCV list, cost_index would multiply the correlation by (1 - 200/1000),
meaning it uses something closer to max_IO_cost than to min_IO_cost. I imagine
it'd be possible to limit this to only those MCVs which pass the quals - if
none pass, then few tuples may be returned, so no correction would be applied
to (additionally) penalize the index scan.
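
A minimal sketch of that adjustment (again hypothetical, not PostgreSQL source;
mcv_dup_tuples and total_tuples are made-up inputs standing in for whatever the
planner would actually derive from the MCV list and the rowcount estimate):

#include <stdio.h>

/*
 * Scale correlation by the fraction of expected tuples that are NOT
 * duplicates drawn from the MCV list, pushing the cost estimate toward
 * max_IO_cost when many duplicates are expected.
 */
static double
mcv_adjusted_correlation(double correlation,
						 double mcv_dup_tuples,
						 double total_tuples)
{
	if (total_tuples > 0.0 && mcv_dup_tuples > 0.0)
		correlation *= 1.0 - (mcv_dup_tuples / total_tuples);

	return correlation;
}

int
main(void)
{
	/* 200 of 1000 expected tuples are MCV duplicates: correlation * 0.8 */
	printf("%f\n", mcv_adjusted_correlation(0.95, 200.0, 1000.0));
	return 0;
}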

In my tests, at one point I implemented idx_corr_fudge(), returning a value
like "fragmentation" from pgstatindex (which I'm sure is where I got the phrase
when reporting the problem). That only uses the leaf nodes' "next" pointer,
and not the individual tuples, which probably works if there's a sufficient
number of repeated keys.

I think that's all for now..

Justin
