Re: Autovacuum on partitioned table

From: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: yuzuko <yuzukohosoya(at)gmail(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>
Subject: Re: Autovacuum on partitioned table
Date: 2020-02-02 03:52:53
Message-ID: CA+fd4k4kSNr5KiANrW5D4zmzbwTU7ZVyC2Ok5bRnT77r1chQzw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 29 Jan 2020 at 17:56, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>
> On Wed, Jan 29, 2020 at 11:29 AM yuzuko <yuzukohosoya(at)gmail(dot)com> wrote:
> > > Besides the complexity of
> > > getting that infrastructure in place, an important question is whether
> > > the current system of applying threshold and scale factor to
> > > changes_since_analyze should be used as-is for inheritance parents
> > > (partitioned tables), because if users set those parameters similarly
> > > to for regular tables, autovacuum might analyze partitioned tables
> > > more than necessary. We'll either need a different formula, or some
> > > commentary in the documentation about how partitioned tables might
> > > need different setting, or maybe both.
> > >
> > I'm not sure but I think we need new autovacuum parameters for
> > partitioned tables (autovacuum, autovacuum_analyze_threshold,
> > autovacuum_analyze_scale_factor) because whether it's necessary
> > to run autovacuum on partitioned tables will depend on users.
> > What do you think?
>
> Yes, we will need to first support those parameters on partitioned
> tables. Currently, you get:
>
> create table p (a int) partition by list (a) with
> (autovacuum_analyze_scale_factor=0);
> ERROR: unrecognized parameter "autovacuum_analyze_scale_factor"
>
> > > How are you going to track changes_since_analyze of partitioned table?
> > > It's just an idea but we can accumulate changes_since_analyze of
> > > partitioned table by adding child tables's value after analyzing each
> > > child table. And compare the partitioned tables value to the threshold
> > > that is computed by (autovacuum_analyze_threshold + total rows
> > > including all child tables * autovacuum_analyze_scale_factor).
> > >
> > The idea Sawada-san mentioned is similar to mine.
>
> So if I understand this idea correctly, a partitioned table's analyze
> will only be triggered when partitions are analyzed. That is,
> inserts, updates, deletes of tuples in partitions will be tracked by
> pgstat, which in turn is used by autovacuum to trigger analyze on
> partitions. Then, partitions changes_since_analyze is added into the
> parent's changes_since_analyze, which in turn *may* trigger analyze
> parent. I said "may", because it would take multiple partition
> analyzes to accumulate enough changes to trigger one on the parent.
> Am I getting that right?

Yeah that is what I meant. In addition, adding partition's
changes_since_analyze to its parent needs to be done recursively as
the parent table could also be a partitioned table.

>
> > Also, for tracking
> > changes_since_analyze, we have to make partitioned table's statistics.
> > To do that, we can invent a new PgStat_StatPartitionedTabEntry based
> > on PgStat_StatTabEntry. Through talking with Amit, I think the new structure
> > needs the following members:
> >
> > tableid
> > changes_since_analyze
> > analyze_timestamp
> > analyze_count
> > autovac_analyze_timestamp
> > autovac_analyze_count
> >
> > Vacuum doesn't run on partitioned tables, so I think members related to
> > (auto) vacuum need not be contained in the structure.
>
> On second thought, maybe we don't need a new PgStat_ struct. We can
> just use what's used for regular tables and leave the fields that
> don't make sense for partitioned tables set to 0, such as those that
> track the counts of scans, tuples, etc. That means we don't have to
> mess with interfaces of existing functions, like this one:
>
> static void relation_needs_vacanalyze(Oid relid,
> AutoVacOpts *relopts,
> Form_pg_class classForm,
> PgStat_StatTabEntry *tabentry, ...

+1

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2020-02-02 05:59:56 Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side
Previous Message Masahiko Sawada 2020-02-02 01:45:12 Re: error context for vacuum to include block number