Re: Resume vacuum and autovacuum from interruption and cancellation

From: David Steele <david(at)pgmasters(dot)net>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Resume vacuum and autovacuum from interruption and cancellation
Date: 2020-04-08 14:00:22
Lists: pgsql-hackers

On 2/28/20 8:56 AM, Masahiko Sawada wrote:
> According to those results, it's thought that the more we resume
> vacuum from the tail of the table, the efficiency is good. Since the
> table is being updated uniformly even during autovacuum it was more
> efficient to restart autovacuum from last position rather than from
> the beginning of the table. I think that results shows somewhat the
> benefit of this patch but I'm concerned that it might be difficult for
> users when to use this option. In practice the efficiency completely
> depends on the dispersion of updated pages, and that test made pages
> dirty uniformly, which is not a common situation. So probably if we
> want this feature, I think we should automatically enable resuming
> when we can basically be sure that resuming is better. For example, we
> remember both the last vacuumed block and how many vacuum-able pages
> seems to exist from there, and we decide to resume vacuum if we can
> expect to process more many pages.

I have to say I'm a bit confused by the point of this patch. I get that
resuming from where the previous vacuum stopped is faster, but isn't
that only true because the entire table is not being vacuumed?

If as you say:

> If we start to vacuum from not first block, we can update neither
> relfrozenxid nor relfrozenxmxid. And we might not be able to update
> even relation statistics.

Then we'll still need to vacuum the entire table before we can be sure
the oldest xid has been removed/frozen. If we could do those updates on
a resume then that would change my thoughts on the feature a lot.
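
To make the concern concrete, here is a toy sketch in Python (not
PostgreSQL code). It assumes a table is just a list of pages, each
holding the unfrozen xids on that page, and that xids are plain integers
with no wraparound. The function name vacuum_from and its parameters are
hypothetical, and the visibility map is ignored entirely. The only point
it illustrates is that a pass starting past block 0 cannot report a new
relfrozenxid, because the skipped pages may still hold older xids.

```python
from typing import List, Optional


def vacuum_from(pages: List[List[int]], start_block: int,
                freeze_limit: int) -> Optional[int]:
    """Toy vacuum pass over pages[start_block:].

    Every xid older than freeze_limit on a scanned page is "frozen"
    (modelled here by dropping it from the page's list).

    Returns the value relfrozenxid could be advanced to, or None when
    the pass did not cover the whole table and so cannot advance it.
    """
    for blkno in range(start_block, len(pages)):
        pages[blkno] = [xid for xid in pages[blkno] if xid >= freeze_limit]

    # Pages before start_block were never looked at, so they may still
    # contain xids older than freeze_limit.
    scanned_all = (start_block == 0)
    return freeze_limit if scanned_all else None


pages = [[90, 200], [150], [95, 300]]
resumed = vacuum_from(pages, start_block=1, freeze_limit=100)  # None
full = vacuum_from(pages, start_block=0, freeze_limit=100)     # 100
```

In this model the resumed pass leaves xid 90 on block 0 untouched, so
only the later full pass can move relfrozenxid forward.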

What am I missing?

I'm marking this Returned with Feedback due to the concerns expressed
up-thread (and my own) and because the patch has been Waiting on Author
for nearly the entire CF.