Re: GIN pending list cleanup during autoanalyze blocks cleanup by VACUUM

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>
Subject: Re: GIN pending list cleanup during autoanalyze blocks cleanup by VACUUM
Date: 2021-10-14 07:49:30
Message-ID: CAD21AoCgVKSJxoKctGPCHWofC0OCxefBZdgC27w=xGT6D2xYiQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Oct 14, 2021 at 7:58 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Sat, Oct 9, 2021 at 5:51 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > While I'm no closer to a backpatchable fix than I was on Thursday, I
> > do have some more ideas about what to do on HEAD. I now lean towards
> > completely ripping analyze_only calls out, there -- the whole idea of
> > calling amvacuumcleanup() routines during autoanalyze (but not plain
> > ANALYZE) seems bolted on. It's not just the risk of similar problems
> > cropping up in the future -- it's that the whole approach seems
> > obsolete. We now generally expect autovacuum to run against
> > insert-only tables.
>
> Attached patch removes calls to each index's amvacuumcleanup() routine
> that take place during ANALYZE. These days we can just rely on
> autovacuum to run against insert-only tables (assuming the user didn't
> go out of their way to disable that behavior).

Looking at the original commit, as you mentioned, ISTM performing
pending list cleanup during (auto)analyze (and analyze_only) was
introduced to perform the pending list cleanup on insert-only tables.
Now that we have autovacuum_vacuum_insert_threshold, we don’t
necessarily need to rely on that.

On the other hand, I still see a little value in performing the
pending list cleanup during autoanalyze. For example, if the user
wants to clean up the pending list frequently in the background (so
that it's not triggered in the INSERT path), it might be better to do
that during autoanalyze than autovacuum. If the table has garbage,
autovacuum has to vacuum all indexes and the table, taking a very long
time. But autoanalyze can be done in a shorter time. If we trigger
autoanalyze frequently and perform pending list cleanup, the pending
list cleanup can also be done in a relatively short time, preventing
MVCC snapshots from being held for a long time.

Therefore, I personally think that it's better to eliminate
analyze_only code after introducing a way that allows us to perform
the pending list cleanup more frequently. I think that the idea of
Jaime Casanova's patch is a good solution.

>
> Having thought about it some more, I have arrived at the conclusion
> that we should backpatch this to Postgres 13, the first version that
> had insert-driven autovacuums (following commit b07642db). This
> approach is unorthodox, because it amounts to disabling a
> theoretically-working feature in the backbranches. Also, I'd be
> drawing the line at Postgres 13, due only to the quite accidental fact
> that that's the first major release that clearly doesn't need this
> mechanism. (As it happens Nikolay was on 12 anyway, so this won't work
> for him, but he already has a workaround IIUC.)
>
> I reached this conclusion because I can't think of a non-invasive fix,
> and I really don't want to go there. At the same time, this behavior
> is barely documented, and is potentially very harmful indeed. I'm sure
> that we should get rid of it on HEAD, but getting rid of it a couple
> of years earlier seems prudent.
>
> Does anybody have any opinion on this, either in favor or against my
> backpatch-to-13 proposal?

I'm not very positive about back-patching. The first reason is what I
described above; I still see little value in performing pending list
cleanup during autoanalyze. Another reason is that if the user relies
on autoanalyze to perform pending list cleanup, they have to enable
autovacuum_vacuum_insert_threshold instead during the minor upgrade.
Since it also means to trigger autovacuum in more cases I think it
will have a profound impact on the existing system.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-10-14 07:53:24 Re: Inconsistent behavior of pg_dump/pg_restore on DEFAULT PRIVILEGES
Previous Message Simon Riggs 2021-10-14 07:48:12 Re: Next Steps with Hash Indexes