Re: Autovacuum on partitioned table (autoanalyze)

From: yuzuko <yuzukohosoya(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Amit Langote <amitlangote09(at)gmail(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>
Subject: Re: Autovacuum on partitioned table (autoanalyze)
Date: 2020-12-02 14:11:16
Message-ID: CAKkQ509GJpRHRSYkrvcAnLbO3KEs4=5Nbydpv54Gu2wOQtZPJw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Alvaro,

Thank you for your comments.

>
> > In second thought about the reason for the "toprel_oid". It is perhaps
> > to avoid "wrongly" propagated values to ancestors after a manual
> > ANALYZE on a partitioned table. But the same happens after an
> > autoanalyze iteration if some of the ancestors of a leaf relation are
> > analyzed before the leaf relation in a autoanalyze iteration. That
> > can trigger an unnecessary analyzing for some of the ancestors.
>
> I'm not sure I understand this point. I think we should only trigger
> this on analyzes of *leaf* partitions, not intermediate partitioned
> relations. That way you would never get these unnecesary analyzes.
> Am I missing something?
>
> (So with my proposal in the previous email, we would send the list of
> ancestor relations after analyzing a leaf partition. Whenever we
> analyze a non-leaf, then the list of ancestors is sent as an empty
> list.)
>
The problem Horiguchi-san mentioned is as follows:

create table p1 (i int) partition by range(i);
create table p1_1 partition of p1 for values from (0) to (500)
partition by range(i);
create table p1_1_1 partition of p1_1 for values from (0) to (300);
insert into p1 select generate_series(0,299);

-- After auto analyze (first time)
postgres=# select relname, n_mod_since_analyze, last_autoanalyze from
pg_stat_all_tables where relname in ('p1','p1_1','p1_1_1');
relname | n_mod_since_analyze | last_autoanalyze
---------+---------------------+-------------------------------
p1 | 300 |
p1_1 | 300 |
p1_1_1 | 0 | 2020-12-02 22:24:18.753574+09
(3 rows)

-- Insert more rows
postgres=# insert into p1 select generate_series(0,199);
postgres=# select relname, n_mod_since_analyze, last_autoanalyze from
pg_stat_all_tables where relname in ('p1','p1_1','p1_1_1');
relname | n_mod_since_analyze | last_autoanalyze
---------+---------------------+-------------------------------
p1 | 300 |
p1_1 | 300 |
p1_1_1 | 200 | 2020-12-02 22:24:18.753574+09
(3 rows)

-- After auto analyze (second time)
postgres=# select relname, n_mod_since_analyze, last_autoanalyze from
pg_stat_all_tables where relname in ('p1','p1_1','p1_1_1');
relname | n_mod_since_analyze | last_autoanalyze
---------+---------------------+-------------------------------
p1 | 0 | 2020-12-02 22:25:18.857248+09
p1_1 | 200 | 2020-12-02 22:25:18.661932+09
p1_1_1 | 0 | 2020-12-02 22:25:18.792078+09

After 2nd auto analyze, all relations' n_mod_since_analyze should be 0,
but p1_1's is not. This is because p1_1 was analyzed before p1_1_1.
So p1_1 will be analyzed again in the 3rd auto analyze.
That is propagating changes_since_analyze to *all ancestors* may cause
unnecessary analyzes even if we do this only when analyzing leaf partitions.
So I think we should track ancestors which are not analyzed in the current
iteration as Horiguchi-san proposed.

Regarding your idea:
> typedef struct PgStat_MsgAnalyze
> {
> PgStat_MsgHdr m_hdr;
> Oid m_databaseid;
> Oid m_tableoid;
> bool m_autovacuum;
> bool m_resetcounter;
> TimestampTz m_analyzetime;
> PgStat_Counter m_live_tuples;
> PgStat_Counter m_dead_tuples;
> int m_nancestors;
> Oid m_ancestors[PGSTAT_NUM_ANCESTORENTRIES];
> } PgStat_MsgAnalyze;

I'm not sure but how about storing only ancestors that aren't analyzed
in the current
iteration in m_ancestors[PGSTAT_NUM_ANCESTORENTRIES] ?

> > > > Regarding the counters in pg_stat_all_tables: maybe some of these should be
> > > > null rather than zero ? Or else you should make an 0001 patch to fully
> > > > implement this view, with all relevant counters, not just n_mod_since_analyze,
> > > > last_*analyze, and *analyze_count. These are specifically misleading:
> > > >
> > > > last_vacuum |
> > > > last_autovacuum |
> > > > n_ins_since_vacuum | 0
> > > > vacuum_count | 0
> > > > autovacuum_count | 0
> > > >
> > > I haven't modified this part yet, but you meant that we should set
> > > null to counters
> > > about vacuum because partitioned tables are not vacuumed?
> >
> > Perhaps bacause partitioned tables *cannot* be vacuumed. I'm not sure
> > what is the best way here. Showing null seems reasonable but I'm not
> > sure that doesn't break anything.
>
> I agree that showing NULLs for the vacuum columns is better. Perhaps
> the most reasonable way to do this is use -1 as an indicator that NULL
> ought to be returned from pg_stat_get_vacuum_count() et al, and add a
> boolean in PgStat_TableCounts next to t_truncated, maybe "t_nullvacuum"
> that says to store -1 instead of 0 in pgstat_recv_tabstat.
>
Thank you for the advice. I'll fix it based on this idea.

--
Best regards,
Yuzuko Hosoya
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Sergei Kornilov 2020-12-02 14:13:56 Re: pg_stat_statements oddity with track = all
Previous Message Alvaro Herrera 2020-12-02 14:04:45 Re: convert elog(LOG) calls to ereport