Re: BUG #16521: n_distinct_inherited does not affect child partitions when set on main partition

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: klaudie(dot)willis(at)protonmail(dot)com, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16521: n_distinct_inherited does not affect child partitions when set on main partition
Date: 2020-06-30 22:56:31
Message-ID: CAApHDvqp_LU5kb+3933aTCbq+UP8EiJ5uU8Qw2AuY+2r-Gt_4A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Wed, 1 Jul 2020 at 08:43, PG Bug reporting form
<noreply(at)postgresql(dot)org> wrote:
> Bug reference: 16521
> Logged by: klaudie willis
> Email address: klaudie(dot)willis(at)protonmail(dot)com
> PostgreSQL version: 13beta1
> Operating system: Windows
> Description:
>
> From documentation: "while n_distinct_inherited affects the statistics
> gathered for the table plus its inheritance children"
>
> However, as my own experience has shown me, and as illustrated here:
> https://www.db-fiddle.com/f/d2ckNd9eyrGAwvdkr8H5i/0 I cannot see that
> "_n_distinct_inherited" affects child partitions when set on the main
> partition table. Am I missing something really simple here?

This is not a bug. n_distinct_inherited was invented for table
inheritance to allow the n_distinct value to be overwritten when the
inherited tables are also being queried.

Consider the difference between:

create table t (a int);
create table t1 (a int) inherits(t);
insert into t select generate_Series(1,2000);
alter table t alter column a set (n_distinct_inherited = 1000);
alter table t alter column a set (n_distinct = 123);
analyze t;
explain select distinct a from t;
QUERY PLAN
----------------------------------------------------------------------
HashAggregate (cost=98.63..108.63 rows=1000 width=4)
Group Key: t.a
-> Append (cost=0.00..87.25 rows=4550 width=4)
-> Seq Scan on t t_1 (cost=0.00..29.00 rows=2000 width=4)
-> Seq Scan on t1 t_2 (cost=0.00..35.50 rows=2550 width=4)
(5 rows)

explain select distinct a from only t;
QUERY PLAN
-----------------------------------------------------------
HashAggregate (cost=34.00..35.23 rows=123 width=4)
Group Key: a
-> Seq Scan on t (cost=0.00..29.00 rows=2000 width=4)
(3 rows)

Notice the HashAggregate's row estimate in each case. The inherited
version applies when the inherited tables are also being queried.

The documents [1] mention: " n_distinct affects the statistics for the
table itself, while n_distinct_inherited affects the statistics
gathered for the table plus its inheritance children.". I think that
explains this behaviour. Perhaps you just assumed what it does based
on the name of the option?

Declarative partitioning reuses much of how table inheritance works,
although, much of that has been getting replaced since declarative
partitioning was added in PG10. Since there's not much sense in
performing an FROM ONLY on a declarative partitioned table, then
perhaps there's some room to have these behave differently for
partitioned tables, but changing their meaning now might confuse
people who are used to what they currently already do.

David

[1] https://www.postgresql.org/docs/12/sql-altertable.html

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Reiner Peterke 2020-07-01 07:25:57 Re: BUG #16516: when testing jit get terminate called after throwing an instance of 'std::bad_function_call'
Previous Message PG Bug reporting form 2020-06-30 20:02:54 BUG #16521: n_distinct_inherited does not affect child partitions when set on main partition