Re: Improve rowcount estimate for UNNEST(column)

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Paul A Jungwirth <pj(at)illuminatedcomputing(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Improve rowcount estimate for UNNEST(column)
Date: 2023-11-30 04:35:14
Message-ID: CACJufxFD_jM3dOORhPr9zpJSjbYerMsgwt_ydQAZqL1=kUvPCg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 27, 2023 at 3:05 PM jian he <jian(dot)universality(at)gmail(dot)com> wrote:
>
> Hi.
> Since both array_op_test, arrest both are not dropped at the end of
> src/test/regress/sql/arrays.sql.
> I found using table array_op_test test more convincing.
>
> select
> reltuples * 10 as original,
> reltuples * (select
> floor(elem_count_histogram[array_length(elem_count_histogram,1)])
> from pg_stats
> where tablename = 'array_op_test' and attname = 'i')
> as with_patch
> ,(select (elem_count_histogram[array_length(elem_count_histogram,1)])
> from pg_stats
> where tablename = 'array_op_test' and attname = 'i')
> as elem_count_histogram_last_element
> from pg_class where relname = 'array_op_test';
> original | with_patch | elem_count_histogram_last_element
> ----------+------------+-----------------------------------
> 1030 | 412 | 4.7843137
> (1 row)
>
> without patch:
> explain select unnest(i) from array_op_test;
> QUERY PLAN
> ----------------------------------------------------------------------
> ProjectSet (cost=0.00..9.95 rows=1030 width=4)
> -> Seq Scan on array_op_test (cost=0.00..4.03 rows=103 width=40)
> (2 rows)
>
> with patch:
> explain select unnest(i) from array_op_test;
> QUERY PLAN
> ----------------------------------------------------------------------
> ProjectSet (cost=0.00..6.86 rows=412 width=4)
> -> Seq Scan on array_op_test (cost=0.00..4.03 rows=103 width=40)
> (2 rows)
> --------

Hi.
I did a minor change. change estimate_array_length return type to
double, cost_tidscan function inside `int ntuples` to `double
ntuples`.

`clamp_row_est(get_function_rows(root, expr->funcid, clause));` will
round 4.7843137 to 5.
so with your patch and my refactor, the rows will be 103 * 5 = 515.

explain select unnest(i) from array_op_test;
QUERY PLAN
----------------------------------------------------------------------
ProjectSet (cost=0.00..7.38 rows=515 width=4)
-> Seq Scan on array_op_test (cost=0.00..4.03 rows=103 width=40)
(2 rows)

Attachment Content-Type Size
v1-0001-minor-refactor.patch text/x-patch 5.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2023-11-30 04:42:26 Re: patch: improve "user mapping not found" error message
Previous Message Yuya Watari 2023-11-30 04:18:57 Re: [PoC] Reducing planning time when tables have many partitions