Re: Resume vacuum and autovacuum from interruption and cancellation

From: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Resume vacuum and autovacuum from interruption and cancellation
Date: 2019-11-05 06:57:07
Message-ID: CA+fd4k5QKsuzwqwuKt=jiuo+7LDmHkdjTQ=xEa=-KGKVV4YTUg@mail.gmail.com
Lists: pgsql-hackers

On Sat, 2 Nov 2019 at 02:10, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Thu, Aug 8, 2019 at 9:42 AM Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com> wrote:
> > Sounds like an interesting idea, but does it really help? Because if
> > vacuum was interrupted previously, wouldn't it already know the dead
> > tuples, etc in the next run quite quickly, as the VM, FSM is already
> > updated for the page in the previous run.
>
> +1. I don't deny that a patch like this could sometimes save
> something, but it doesn't seem like it would save all that much all
> that often. If your autovacuum runs are being frequently cancelled,
> that's going to be a big problem, I think.

I've observed cases where a user wants to cancel a very long-running
autovacuum (sometimes an anti-wraparound one) in order to run DDL or
some other maintenance work. If the table is very large, autovacuum
can take a long time, and it might not reclaim enough garbage.

> And as Rafia says, even
> though you might do a little extra work reclaiming garbage from
> subsequently-modified pages toward the beginning of the table, it
> would be unusual if they'd *all* been modified. Plus, if they've
> recently been modified, they're more likely to be in cache.
>
> I think this patch really needs a test scenario or demonstration of
> some kind to prove that it produces a measurable benefit.

Okay. A simple test could be to cancel a long-running vacuum on a
large table that is being updated, and then rerun vacuum. Then we can
check how much garbage remains on that table. I'll test it.

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
