Re: document the need to analyze partitioned tables

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, yuzuko <yuzukohosoya(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: document the need to analyze partitioned tables
Date: 2023-01-17 21:00:50
Message-ID: 20230117210050.GX9837@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 17, 2023 at 03:53:24PM -0500, Bruce Momjian wrote:
> On Thu, Jan 12, 2023 at 03:27:47PM -0800, Nathan Bossart wrote:
> > On Wed, Oct 05, 2022 at 10:37:01AM +0200, Laurenz Albe wrote:
> > > On Mon, 2022-03-28 at 15:05 +0200, Tomas Vondra wrote:
> > >> I've pushed the last version, and backpatched it to 10 (not sure I'd
> > >> call it a bugfix, but I certainly agree with Justin it's worth
> > >> mentioning in the docs, even on older branches).
> > >
> > > I'd like to suggest an improvement to this. The current wording could
> > > be read to mean that dead tuples won't get cleaned up in partitioned tables.
> >
> > Well, dead tuples won't get cleaned up in partitioned tables, as
> > partitioned tables do not have storage. But I see what you mean. Readers
> > might misinterpret this to mean that autovacuum will not process the
> > partitions. There's a good definition of what the docs mean by
> > "partitioned table" [0], but FWIW it took me some time before I
> > consistently read "partitioned table" to mean "only the thing with relkind
> > set to 'p'" and not "both the partitioned table and its partitions." So,
> > while the current wording it technically correct, I think it'd be
> > reasonable to expand it to help avoid confusion.
> >
> > Here is my take on the wording:
> >
> > Since all the data for a partitioned table is stored in its partitions,
> > autovacuum does not process partitioned tables. Instead, autovacuum
> > processes the individual partitions that are regular tables. This
> > means that autovacuum only gathers statistics for the regular tables
> > that serve as partitions and not for the partitioned tables. Since
> > queries may rely on a partitioned table's statistics, you should
> > collect statistics via the ANALYZE command when it is first populated,
> > and again whenever the distribution of data in its partitions changes
> > significantly.
>
> Uh, what about autovacuum's handling of partitioned tables? This makes
> it sound like it ignores them because it talks about manual ANALYZE.

If we're referring to the *partitioned* table, then it does ignore them.
See:

|commit 6f8127b7390119c21479f5ce495b7d2168930e82
|Author: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
|Date: Mon Aug 16 17:27:52 2021 -0400
|
| Revert analyze support for partitioned tables

Maybe (all?) the clarification the docs need is to say:
"Partitioned tables are not *themselves* processed by autovacuum."

--
Justin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-01-17 21:12:37 Re: Extracting cross-version-upgrade knowledge from buildfarm client
Previous Message Justin Pryzby 2023-01-17 20:59:04 Re: Progress report of CREATE INDEX for nested partitioned tables