Re: Autovacuum on partitioned table

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: yuzuko <yuzukohosoya(at)gmail(dot)com>
Cc: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>
Subject: Re: Autovacuum on partitioned table
Date: 2020-01-29 08:56:40
Message-ID: CA+HiwqFcFh=6PDn-qg=wcj0QEJ+MP6h1gEbKjGc8RYeMM-_DOg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 29, 2020 at 11:29 AM yuzuko <yuzukohosoya(at)gmail(dot)com> wrote:
> > Besides the complexity of
> > getting that infrastructure in place, an important question is whether
> > the current system of applying threshold and scale factor to
> > changes_since_analyze should be used as-is for inheritance parents
> > (partitioned tables), because if users set those parameters similarly
> > to for regular tables, autovacuum might analyze partitioned tables
> > more than necessary. We'll either need a different formula, or some
> > commentary in the documentation about how partitioned tables might
> > need different setting, or maybe both.
> >
> I'm not sure but I think we need new autovacuum parameters for
> partitioned tables (autovacuum, autovacuum_analyze_threshold,
> autovacuum_analyze_scale_factor) because whether it's necessary
> to run autovacuum on partitioned tables will depend on users.
> What do you think?

Yes, we will need to first support those parameters on partitioned
tables. Currently, you get:

create table p (a int) partition by list (a) with
(autovacuum_analyze_scale_factor=0);
ERROR: unrecognized parameter "autovacuum_analyze_scale_factor"

> > How are you going to track changes_since_analyze of partitioned table?
> > It's just an idea but we can accumulate changes_since_analyze of
> > partitioned table by adding child tables's value after analyzing each
> > child table. And compare the partitioned tables value to the threshold
> > that is computed by (autovacuum_analyze_threshold + total rows
> > including all child tables * autovacuum_analyze_scale_factor).
> >
> The idea Sawada-san mentioned is similar to mine.

So if I understand this idea correctly, a partitioned table's analyze
will only be triggered when partitions are analyzed. That is,
inserts, updates, deletes of tuples in partitions will be tracked by
pgstat, which in turn is used by autovacuum to trigger analyze on
partitions. Then, partitions changes_since_analyze is added into the
parent's changes_since_analyze, which in turn *may* trigger analyze
parent. I said "may", because it would take multiple partition
analyzes to accumulate enough changes to trigger one on the parent.
Am I getting that right?

> Also, for tracking
> changes_since_analyze, we have to make partitioned table's statistics.
> To do that, we can invent a new PgStat_StatPartitionedTabEntry based
> on PgStat_StatTabEntry. Through talking with Amit, I think the new structure
> needs the following members:
>
> tableid
> changes_since_analyze
> analyze_timestamp
> analyze_count
> autovac_analyze_timestamp
> autovac_analyze_count
>
> Vacuum doesn't run on partitioned tables, so I think members related to
> (auto) vacuum need not be contained in the structure.

On second thought, maybe we don't need a new PgStat_ struct. We can
just use what's used for regular tables and leave the fields that
don't make sense for partitioned tables set to 0, such as those that
track the counts of scans, tuples, etc. That means we don't have to
mess with interfaces of existing functions, like this one:

static void relation_needs_vacanalyze(Oid relid,
AutoVacOpts *relopts,
Form_pg_class classForm,
PgStat_StatTabEntry *tabentry, ...

Thanks,
Amit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Charsley 2020-01-29 09:41:02 Data race in interfaces/libpq/fe-exec.c
Previous Message Masahiko Sawada 2020-01-29 08:42:56 Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)