Re: autovacuum: change priority of the vacuumed tables

From: Jim Nasby <jim(dot)nasby(at)openscg(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Grigory Smolkin <g(dot)smolkin(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: autovacuum: change priority of the vacuumed tables
Date: 2018-03-03 19:32:33
Message-ID: efe9bebf-faa6-e2fd-6b2e-7af2363ae3db@openscg.com
Lists: pgsql-hackers

On 2/19/18 10:00 AM, Tomas Vondra wrote:
> So I don't think this is a very promising approach, unfortunately.
>
> What I think might work is having a separate pool of autovac workers,
> dedicated to these high-priority tables. That still would not guarantee
> the high-priority tables are vacuumed immediately, but at least that
> they are not stuck in the worker queue behind low-priority ones.
>
> I wonder if we could detect tables with high update/delete activity and
> promote them to high-priority automatically. The reasoning is that by
> delaying the cleanup for those tables would result in significantly more
> bloat than for those with low update/delete activity.

I've looked at this stuff in the past, and I think the first step in
improving autovacuum needs to be a much more granular means of
controlling which tables a worker selects, and exposing that ability.
There are simply too many different scenarios for any single policy to
satisfy everyone. As a simple example, OLTP databases (especially ones
with small queue tables) have very different vacuum needs than data
warehouses.

One fairly simple option would be to replace the logic that currently
builds a worker's table list with a query run via SPI. That would allow
important tables to be prioritized. It could also reduce the problem of
workers getting "stuck" on a ton of large tables, by taking into account
the total number of pages/tuples a list contains.
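
To make the selection idea concrete, here is a rough sketch in Python
rather than as an actual SPI query. It is not how autovacuum works today.
The Candidate type, the per-table "priority" value, and the page budget
are all hypothetical names invented for illustration; only the notion of
ordering by priority and limiting a list by its total pages comes from
the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    priority: int  # hypothetical user-assigned priority; higher goes first
    pages: int     # table size in pages (think pg_class.relpages)


def build_worker_list(candidates, page_budget):
    """Pick tables in priority order while staying within a page budget."""
    # Highest priority first; among equal priorities, smaller tables first.
    ordered = sorted(candidates, key=lambda c: (-c.priority, c.pages))
    chosen, used = [], 0
    for c in ordered:
        # Always accept the first table so one oversized table is not
        # skipped forever; after that, skip anything that would push the
        # list past the budget.
        if chosen and used + c.pages > page_budget:
            continue
        chosen.append(c)
        used += c.pages
    return chosen


tables = [
    Candidate("facts", priority=1, pages=900_000),
    Candidate("orders", priority=5, pages=10_000),
    Candidate("queue", priority=10, pages=50),
    Candidate("small", priority=1, pages=200),
]
worker_list = build_worker_list(tables, page_budget=20_000)
print([c.name for c in worker_list])
```

With a budget like this, the huge table is left for a later list (or
another worker) instead of holding up the small, hot ones queued behind
it.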

A more fine-grained approach would be to have workers make a new
selection after every vacuum they complete. That would provide the
ultimate in control, since you'd be able to see exactly what all the
other workers are doing.
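
As a toy illustration of per-vacuum re-selection, again in Python and
under simplifying assumptions: every worker is assumed to finish exactly
one table per round, and pick_next() and simulate() are hypothetical
names, not existing autovacuum functions. The point is only that each
choice is made against the current state, including what the other
workers have already claimed.

```python
def pick_next(pending, in_progress):
    """Return the highest-priority pending table no other worker holds."""
    available = [t for t in pending if t not in in_progress]
    if not available:
        return None
    return max(available, key=lambda t: pending[t])


def simulate(pending, n_workers):
    """Round-based toy model: each worker vacuums one table per round."""
    if n_workers < 1:
        raise ValueError("need at least one worker")
    pending = dict(pending)  # table name -> priority (hypothetical)
    log = []
    while pending:
        in_progress = {}  # worker id -> table it is vacuuming this round
        for worker in range(n_workers):
            # The selection happens fresh each time and can see what
            # every other worker is currently doing.
            table = pick_next(pending, set(in_progress.values()))
            if table is None:
                continue
            in_progress[worker] = table
            log.append((worker, table))
        # End of round: all vacuums started in this round are complete.
        for table in in_progress.values():
            del pending[table]
    return log


log = simulate({"queue": 10, "orders": 5, "facts": 1}, n_workers=2)
print(log)
```

A table that becomes urgent between rounds would be picked up at the very
next selection rather than waiting for a worker to drain a prebuilt list.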
--
Jim Nasby, Chief Data Architect, Austin TX
OpenSCG http://OpenSCG.com
