Re: Poor row estimates from planner, stat `most_common_elems` sometimes missing for a text[] column

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Frost <FROSTMAR(at)uk(dot)ibm(dot)com>
Cc: "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: Poor row estimates from planner, stat `most_common_elems` sometimes missing for a text[] column
Date: 2025-06-05 18:57:10
Message-ID: 1311236.1749149830@sss.pgh.pa.us
Lists: pgsql-performance

Mark Frost <FROSTMAR(at)uk(dot)ibm(dot)com> writes:
> We're seeing intermittently very poor performance of a query, when occasionally a poor query plan is chosen. We're using Postgres 16.9.
> One suspicious factor when looking at the EXPLAIN ANALYZE output is a very wrong estimated number of rows to be returned from a text[] column queried with '&&'.
> After playing around with a simple recreate (details below), it seems ANALYZE of the table is affected by the number of rows in the table. Statistic `most_common_elems` is [null] when there are over 15,873 rows in the table when analyzed. With fewer rows it's analyzed correctly.

Thanks for the report. Poking through the code, it seems like
there are two distinct problems here:

1. ANALYZE uses a "lossy counting" algorithm (dating to commit
0e5e167aa) to estimate the frequencies of array element values.
The part of that that seems to be going off the rails is
this selection of a cutoff frequency below which element values
will be dropped:

    /*
     * Construct an array of the interesting hashtable items, that is,
     * those meeting the cutoff frequency (s - epsilon)*N. Also identify
     * the minimum and maximum frequencies among these items.
     *
     * Since epsilon = s/10 and bucket_width = 1/epsilon, the cutoff
     * frequency is 9*N / bucket_width.
     */
    cutoff_freq = 9 * element_no / bucket_width;
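
(Spelling out the arithmetic in that comment: since epsilon = s/10 and
bucket_width = 1/epsilon, the cutoff (s - epsilon)*N = 9*epsilon*N =
9*N / bucket_width, which the code then truncates to an integer.)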

The first thing I find suspicious here is that the calculation is
based on element_no (the total number of array elements processed)
and not nonnull_cnt (the maximum possible frequency). Is that
really right? It wouldn't change the results in your reproducer
with just one element per array, but it seems bogus.

More relevant to your immediate problem, this creates a behavioral
cliff at the point where cutoff_freq rises from 0 to 1, which with the
default attstattarget turns out to be, you guessed it, 15873 elements.
In your example, all the element values have frequency 1, so that
switches us from being willing to record all the values to being
willing to record none of them. That doesn't feel right either.
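
For concreteness, here's a back-of-the-envelope check of where that
cliff lands. This is a standalone sketch, not code from the tree; it
assumes num_mcelem = 10 * attstattarget and
bucket_width = num_mcelem * 1000 / 7, which is how I read
compute_array_stats's setup:

    #include <stdio.h>

    int
    main(void)
    {
        int     num_mcelem = 10 * 100;  /* default_statistics_target = 100 */
        int     bucket_width = num_mcelem * 1000 / 7;   /* 142857 */
        int     element_no;

        for (element_no = 15871; element_no <= 15874; element_no++)
            printf("element_no = %d -> cutoff_freq = %d\n",
                   element_no, 9 * element_no / bucket_width);
        return 0;
    }

That prints cutoff_freq = 0 for 15871 and 15872 and 1 from 15873 on,
which is exactly the threshold in your reproducer: one more element and
every frequency-1 value gets thrown away.
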
By analogy to our treatment of regular MCVs, it seems like we should
never be willing to store values that we didn't see at least twice.
Of course, doing that would make this example worse, because then
we'd not store any "most common" elements at smaller rowcounts either.
Which brings us to the other major problem this reveals:

2. In array_selfuncs.c, we fall back to default selectivity
estimates if there's no MCELEM statistics field. What we
should be doing, if there are other stats for the column but
not MCELEM, is realizing that compute_array_stats did not find
any elements that are common enough to be worth recording.
Then we'd use some much lower-than-default estimate for the
selectivity.
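
Roughly, the shape of that could be something like the sketch below
(hypothetical names and constants throughout; this is just the idea,
not a patch against array_selfuncs.c):

    #include <stdbool.h>

    /*
     * Hypothetical sketch: selectivity guess for an array-overlap qual
     * when the column has no MCELEM statistics slot.
     */
    static double
    overlap_sel_without_mcelem(bool have_stats_tuple,
                               double cutoff_elem_freq)
    {
        if (have_stats_tuple)
        {
            /*
             * ANALYZE examined the column and found no element common
             * enough to record, so no element frequency exceeds the
             * cutoff it applied; use an estimate of that order (one
             * arbitrary choice for this sketch) rather than the
             * generic default.
             */
            return cutoff_elem_freq;
        }

        /* no stats at all: the generic default is all we have */
        return 0.01;            /* stand-in for the existing default */
    }

(How the cutoff frequency would be made available to the estimator is
hand-waved here; the point is just to distinguish "no stats" from
"stats, but nothing common enough to record".)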

I don't have any immediate patch to offer, but clearly this
area could use another look.

compute_array_stats seems to have borrowed the lossy-counting
algorithm from ts_typanalyze, so we'd better take a look at
that too.

regards, tom lane
