Re: document the need to analyze partitioned tables

From: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, yuzuko <yuzukohosoya(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: document the need to analyze partitioned tables
Date: 2022-10-06 06:02:07
Message-ID: 3df5c68b-13aa-53d0-c0ec-ed98e6972e2e@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 10/5/22 13:37, Laurenz Albe wrote:
> On Mon, 2022-03-28 at 15:05 +0200, Tomas Vondra wrote:
>> I've pushed the last version, and backpatched it to 10 (not sure I'd
>> call it a bugfix, but I certainly agree with Justin it's worth
>> mentioning in the docs, even on older branches).
>
> I'd like to suggest an improvement to this. The current wording could
> be read to mean that dead tuples won't get cleaned up in partitioned tables.
>
>
> By the way, where are the statistics of a partitioned tables used? The actual
> tables scanned are always the partitions, and in the execution plans that
> I have seen, the optimizer always used the statistics of the partitions.
For example, it is used to estimate selectivity of join clause:

CREATE TABLE test (id integer, val integer) PARTITION BY hash (id);
CREATE TABLE test_0 PARTITION OF test
FOR VALUES WITH (modulus 2, remainder 0);
CREATE TABLE test_1 PARTITION OF test
FOR VALUES WITH (modulus 2, remainder 1);

INSERT INTO test (SELECT q, q FROM generate_series(1,10) AS q);
VACUUM ANALYZE test;
INSERT INTO test (SELECT q, q%2 FROM generate_series(11,200) AS q);
VACUUM ANALYZE test_0,test_1;

EXPLAIN (ANALYZE, TIMING OFF, SUMMARY OFF)
SELECT * FROM test t1, test t2 WHERE t1.id = t2.val;
VACUUM ANALYZE test;
EXPLAIN (ANALYZE, TIMING OFF, SUMMARY OFF)
SELECT * FROM test t1, test t2 WHERE t1.id = t2.val;

Here without actual statistics on parent table we make wrong prediction.

--
Regards
Andrey Lepikhov
Postgres Professional

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2022-10-06 06:06:11 Re: Add last failed connection error message to pg_stat_wal_receiver
Previous Message Kyotaro Horiguchi 2022-10-06 05:52:06 Re: pg_statsinfo - PG15 support?