RE: Poor row estimates from planner, stat `most_common_elems` sometimes missing for a text[] column

From: Mark Frost <FROSTMAR(at)uk(dot)ibm(dot)com>
To: "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: RE: Poor row estimates from planner, stat `most_common_elems` sometimes missing for a text[] column
Date: 2025-06-06 09:21:34
Message-ID: PH3PPF1C905D6E61EEB89CF46C1F572D402E16EA@PH3PPF1C905D6E6.namprd15.prod.outlook.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-performance


> On 6/5/25 17:42, Mark Frost wrote:
> > Is there any good explanation for this behaviour? Preferably we’d like
> > some way for proper `most_common_elems` statistics to be collected in
> > our production database, in the hope that influences a good query plan
> > to always be selected.

> most_common_elems has a limited size, and if all the elements have the
> same freq, there's nothing we can do.

> You could do: alter table test alter column tags set statistics X;
> However, X is capped at 10000…

Actually *any* most_common_elems stats would be fine, because the reasoning is:

* If the searched element is in most_common_elems we know it’s frequency
* If it’s not, it’s less frequent than the least most_common_elems

So in our case when every row is unique, we’d only actually need stats to record a single most_common_elems (if only it would record one)

Unless otherwise stated above:

IBM United Kingdom Limited
Registered in England and Wales with number 741598
Registered office: Building C, IBM Hursley Office, Hursley Park Road, Winchester, Hampshire SO21 2JN

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2025-06-07 01:29:54 Re: Poor row estimates from planner, stat `most_common_elems` sometimes missing for a text[] column
Previous Message Frédéric Yhuel 2025-06-06 08:07:38 Re: Poor row estimates from planner, stat `most_common_elems` sometimes missing for a text[] column