Re: Autovacuum on partitioned table (autoanalyze)

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: yuzuko <yuzukohosoya(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Amit Langote <amitlangote09(at)gmail(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>
Subject: Re: Autovacuum on partitioned table (autoanalyze)
Date: 2021-04-08 15:27:57
Message-ID: 20210408152757.GA2691@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2021-Apr-08, Tomas Vondra wrote:

> On 4/8/21 5:22 AM, Alvaro Herrera wrote:

> > However, I just noticed there is a huge problem, which is that the new
> > code in relation_needs_vacanalyze() is doing find_all_inheritors(), and
> > we don't necessarily have a snapshot that lets us do that. While adding
> > a snapshot acquisition at that spot is a very easy fix, I hesitate to
> > fix it that way, because the whole idea there seems quite wasteful: we
> > have to look up, open and lock every single partition, on every single
> > autovacuum iteration through the database. That seems bad. I'm
> > inclined to think that a better idea may be to store reltuples for the
> > partitioned table in pg_class.reltuples, instead of having to add up the
> > reltuples of each partition. I haven't checked if this is likely to
> > break anything.
>
> How would that value get updated, for the parent?

Same as for any other relation: ANALYZE would set it, after it's done
scanning the table. We would to make sure that nothing resets it to
empty, though, and that it doesn't cause issues elsewhere. (The patch I
sent contains the minimal change to make it work, but of course that's
missing having other pieces of code maintain it.)

> > (Also, a minor buglet: if we do ANALYZE (col1), then ANALYZE (col2) a
> > partition, then we repeatedly propagate the counts to the parent table,
> > so we would cause the parent to be analyzed more times than it should.
> > Sounds like we should not send the ancestor list when a column list is
> > given to manual analyze. I haven't verified this, however.)
>
> Are you sure? I haven't tried, but shouldn't this be prevented by only
> sending the delta between the current and last reported value?

I did try, and yes it behaves as you say.

--
Álvaro Herrera Valdivia, Chile
Bob [Floyd] used to say that he was planning to get a Ph.D. by the "green
stamp method," namely by saving envelopes addressed to him as 'Dr. Floyd'.
After collecting 500 such letters, he mused, a university somewhere in
Arizona would probably grant him a degree. (Don Knuth)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Steele 2021-04-08 15:32:52 Re: [HACKERS] Custom compression methods
Previous Message Kazutaka Onishi 2021-04-08 15:25:58 Re: TRUNCATE on foreign table