Re: Autovacuum on partitioned table

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: yuzuko <yuzukohosoya(at)gmail(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>
Subject: Re: Autovacuum on partitioned table
Date: 2020-02-03 02:36:55
Message-ID: CA+HiwqFx5S95v1h=bsKa150zdVD60ySt5DV+evN7eKEHDiUoeA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Feb 2, 2020 at 12:53 PM Masahiko Sawada
<masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> On Wed, 29 Jan 2020 at 17:56, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > On Wed, Jan 29, 2020 at 11:29 AM yuzuko <yuzukohosoya(at)gmail(dot)com> wrote:
> > > > How are you going to track changes_since_analyze of partitioned table?
> > > > It's just an idea but we can accumulate changes_since_analyze of
> > > > partitioned table by adding child tables's value after analyzing each
> > > > child table. And compare the partitioned tables value to the threshold
> > > > that is computed by (autovacuum_analyze_threshold + total rows
> > > > including all child tables * autovacuum_analyze_scale_factor).
> > > >
> > > The idea Sawada-san mentioned is similar to mine.
> >
> > So if I understand this idea correctly, a partitioned table's analyze
> > will only be triggered when partitions are analyzed. That is,
> > inserts, updates, deletes of tuples in partitions will be tracked by
> > pgstat, which in turn is used by autovacuum to trigger analyze on
> > partitions. Then, partitions changes_since_analyze is added into the
> > parent's changes_since_analyze, which in turn *may* trigger analyze
> > parent. I said "may", because it would take multiple partition
> > analyzes to accumulate enough changes to trigger one on the parent.
> > Am I getting that right?
>
> Yeah that is what I meant. In addition, adding partition's
> changes_since_analyze to its parent needs to be done recursively as
> the parent table could also be a partitioned table.

That's a good point. So, changes_since_analyze increments are
essentially propagated from leaf partitions to all the way up to the
root table, including any intermediate partitioned tables. We'll need
to consider whether we should propagate only one level at a time (from
bottom of the tree) or update all parents up to the root, every time a
leaf partition is analyzed. If we we do the latter, that might end up
triggering analyze on all the parents at the same time causing
repeated scanning of the same child tables in close intervals,
although setting analyze threshold and scale factor of the parent
tables of respective levels wisely can help avoid any negative impact
of that.

Thanks,
Amit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Chris Travers 2020-02-03 02:37:01 Re: Internal key management system
Previous Message Tom Lane 2020-02-03 02:14:15 Re: Portal->commandTag as an enum