Re: document the need to analyze partitioned tables

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, yuzuko <yuzukohosoya(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: document the need to analyze partitioned tables
Date: 2023-01-18 21:23:41
Message-ID: Y8hjXaVPc9rSkBQw@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 18, 2023 at 10:15:18AM +0100, Laurenz Albe wrote:
> On Tue, 2023-01-17 at 16:16 -0500, Bruce Momjian wrote:
> > On Tue, Jan 17, 2023 at 03:00:50PM -0600, Justin Pryzby wrote:
> > > Maybe (all?) the clarification the docs need is to say:
> > > "Partitioned tables are not *themselves* processed by autovacuum."
> >
> > Yes, I think the lack of autovacuum needs to be specifically mentioned
> > since most people assume autovacuum handles _all_ statistics updating.
> >
> > Can someone summarize how bad it is we have no statistics on partitioned
> > tables?  It sounds bad to me.
>
> Andrey Lepikhov had an example earlier in this thread[1]. It doesn't take
> an exotic query.
>
> Attached is a new version of my patch that tries to improve the wording.

Ah, yes, that is the example I saw but could not re-find. Here is the
output:

CREATE TABLE test (id integer, val integer) PARTITION BY hash (id);

CREATE TABLE test_0 PARTITION OF test
FOR VALUES WITH (modulus 2, remainder 0);
CREATE TABLE test_1 PARTITION OF test
FOR VALUES WITH (modulus 2, remainder 1);

INSERT INTO test (SELECT q, q FROM generate_series(1,10) AS q);

VACUUM ANALYZE test;

INSERT INTO test (SELECT q, q%2 FROM generate_series(11,200) AS q);

VACUUM ANALYZE test_0,test_1;

EXPLAIN (ANALYZE, TIMING OFF, SUMMARY OFF)
SELECT * FROM test t1, test t2 WHERE t1.id = t2.val;
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Hash Join (cost=7.50..15.25 rows=200 width=16) (actual rows=105 loops=1)
Hash Cond: (t1.id = t2.val)
-> Append (cost=0.00..5.00 rows=200 width=8) (actual rows=200 loops=1)
-> Seq Scan on test_0 t1_1 (cost=0.00..2.13 rows=113 width=8) (actual rows=113 loops=1)
-> Seq Scan on test_1 t1_2 (cost=0.00..1.87 rows=87 width=8) (actual rows=87 loops=1)
-> Hash (cost=5.00..5.00 rows=200 width=8) (actual rows=200 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 16kB
-> Append (cost=0.00..5.00 rows=200 width=8) (actual rows=200 loops=1)
-> Seq Scan on test_0 t2_1 (cost=0.00..2.13 rows=113 width=8) (actual rows=113 loops=1)
-> Seq Scan on test_1 t2_2 (cost=0.00..1.87 rows=87 width=8) (actual rows=87 loops=1)

VACUUM ANALYZE test;

EXPLAIN (ANALYZE, TIMING OFF, SUMMARY OFF)
SELECT * FROM test t1, test t2 WHERE t1.id = t2.val;
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Hash Join (cost=7.50..15.25 rows=200 width=16) (actual rows=105 loops=1)
Hash Cond: (t2.val = t1.id)
-> Append (cost=0.00..5.00 rows=200 width=8) (actual rows=200 loops=1)
-> Seq Scan on test_0 t2_1 (cost=0.00..2.13 rows=113 width=8) (actual rows=113 loops=1)
-> Seq Scan on test_1 t2_2 (cost=0.00..1.87 rows=87 width=8) (actual rows=87 loops=1)
-> Hash (cost=5.00..5.00 rows=200 width=8) (actual rows=200 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 16kB
-> Append (cost=0.00..5.00 rows=200 width=8) (actual rows=200 loops=1)
-> Seq Scan on test_0 t1_1 (cost=0.00..2.13 rows=113 width=8) (actual rows=113 loops=1)
-> Seq Scan on test_1 t1_2 (cost=0.00..1.87 rows=87 width=8) (actual rows=87 loops=1)

I see the inner side uses 'val' in the first EXPLAIN and 'id' in the
second, and you are right that 'val' has mostly 0/1.

Is it possible to document when partition table statistics helps?

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect,
which you will never be.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Sandro Santilli 2023-01-18 21:42:23 Re: Ability to reference other extensions by schema in extension scripts
Previous Message Tom Lane 2023-01-18 21:14:21 Re: Extracting cross-version-upgrade knowledge from buildfarm client