Re: Improve rowcount estimate for UNNEST(column)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: Paul A Jungwirth <pj(at)illuminatedcomputing(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Improve rowcount estimate for UNNEST(column)
Date: 2023-11-26 20:22:04
Message-ID: 3699083.1701030124@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> writes:
> On Sat, 2023-11-25 at 09:19 -0800, Paul A Jungwirth wrote:
>> Here is a patch to improve rowcount estimates for
>> `UNNEST(some_array_column)`. Today we hard code this to 10, but we
>> have statistics about array size, so it's easy to use them.

> The idea sounds good to me.

I didn't read the patch either yet, but it seems like a reasonable idea.

> I didn't test or scrutinize the code, but I noticed that you use
> EXPLAIN in the regression tests. I think that makes the tests vulnerable
> to changes in the parameters or in the block size.

Yes, this regression test is entirely unacceptable; the numbers will
not be stable enough. Even aside from the different-settings issue,
you can't rely on ANALYZE deriving exactly the same stats every time.
Usually what we try to do is devise a query where the plan shape
changes because of the better estimate. That typically will provide
some insulation against small changes in the numerical estimates.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2023-11-26 22:41:51 Re: GUC names in messages
Previous Message Laurenz Albe 2023-11-26 20:11:59 Re: Improve rowcount estimate for UNNEST(column)