Restartable VACUUM design overview version 2

From: Galy Lee <lee(dot)galy(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Restartable VACUUM design overview version 2
Date: 2007-03-05 09:48:29
Message-ID: 45EBE76D.3000307@oss.ntt.co.jp
Lists: pgsql-hackers

Hi

Thanks for all the feedback and good ideas on the restartable vacuum.
Here is a new design overview of it based on previous discussions.

There are several ideas for handling a long-running VACUUM within a
defined maintenance window. One idea: when the maintenance time is
running out, we can do either of the following:
- send a smart stop request to VACUUM, so that it stops at a suitable
point. (Reaching that point might take a long time.)
- change the cost delay settings of VACUUM on the fly to make it
less aggressive.

Both ideas are discussed below.

Restartable vacuum design overview
----------------------------------

* Where to stop:

There are two approaches to stop a vacuum:

(1) tell VACUUM where to stop when it starts
VACUUM can be told at start time to stop at a suitable point (by SQL
syntax like: VACUUM SOME). The stop point is after one
full fill-workmem-clean-index-clean-deadtuple cycle: VACUUM stops
when it has finished such a cycle.

(2) interrupt VACUUM while it is running
Another approach is to interrupt the running VACUUM. VACUUM checks
for a smart stop request at the normal vacuum delay points. If such a
request is detected, a flag is set to tell VACUUM to stop at a suitable
point, and VACUUM then stops at the end of the current full
fill-workmem-clean-index-clean-deadtuple cycle.

However, I cannot figure out a simple way to ask a running VACUUM to
stop in (2), because the backend signals have all been used up. (1) is
simple to implement, because it doesn't require any communication
with the running VACUUM.
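
The stop rule shared by both approaches (stop only at the end of a full
cycle) can be sketched as a small stand-alone simulation. This is only
an illustration, not PostgreSQL code: VacuumSim, VacuumResult,
vacuum_run and the constant number of dead tuples per block are all
assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* All names here are hypothetical; this is not the PostgreSQL API. */
typedef struct VacuumSim {
    unsigned total_blocks;    /* heap size in blocks */
    unsigned dead_per_block;  /* dead tuples per block (simplified: constant) */
    unsigned workmem_slots;   /* dead-tuple pointers that fit in work memory */
    unsigned stop_request_at; /* block at which a stop request arrives;
                               * >= total_blocks means "never" */
} VacuumSim;

typedef struct VacuumResult {
    unsigned next_block; /* first block that has not been scanned yet */
    unsigned cycles;     /* completed fill/clean-index/clean-heap cycles */
    bool     finished;   /* true if the whole table was processed */
} VacuumResult;

VacuumResult vacuum_run(const VacuumSim *sim, unsigned start_block)
{
    VacuumResult res = { start_block, 0, false };
    bool stop_requested = false;
    unsigned collected = 0;
    unsigned blk = start_block;

    /* One block's dead tuples must fit in work memory. */
    assert(sim->workmem_slots >= sim->dead_per_block);

    while (blk < sim->total_blocks) {
        /* Delay point: only note the request, do not stop yet. */
        if (blk >= sim->stop_request_at)
            stop_requested = true;

        /* Work memory is full: clean indexes, then dead heap tuples. */
        if (collected + sim->dead_per_block > sim->workmem_slots) {
            collected = 0;
            res.cycles++;
            res.next_block = blk;  /* blocks before blk are fully cleaned */
            if (stop_requested)
                return res;        /* stop only at a cycle boundary */
        }

        collected += sim->dead_per_block;
        blk++;
    }

    /* Final, possibly partial, cycle at the end of the table. */
    if (collected > 0)
        res.cycles++;
    res.next_block = sim->total_blocks;
    res.finished = true;
    return res;
}
```

With 100 blocks, 2 dead tuples per block and room for 20 pointers, a
request arriving at block 15 is honoured only at block 20, where the
second cycle ends. This is why a stop might take a long time.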

* How to stop

When VACUUM is stopping,
- it saves the block number that it has reached to pg_class;
- it also updates the free space information in the FSM. (This might be
posted as a separate patch.)

* How to restart:

When VACUUM restarts, it reads the stored block number from pg_class
and resumes the interrupted scan from there.
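
A minimal sketch of the save/restart bookkeeping, using an in-memory
struct in place of a pg_class row. The field vacuum_restart_block, the
NO_SAVED_BLOCK sentinel and both function names are assumptions made
for illustration. The fallback to block 0 when the table has shrunk is
also an assumption, since that case is not covered above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "no interrupted scan is recorded". */
#define NO_SAVED_BLOCK UINT32_MAX

/* Stand-in for a pg_class row; vacuum_restart_block is a made-up column. */
typedef struct RelEntry {
    uint32_t relpages;             /* current size of the table in blocks */
    uint32_t vacuum_restart_block; /* where the next VACUUM should resume */
} RelEntry;

/* Called when VACUUM stops: remember the first unscanned block. */
void vacuum_save_progress(RelEntry *rel, uint32_t next_block)
{
    assert(next_block <= rel->relpages);
    /* A scan that reached the end leaves nothing to resume. */
    rel->vacuum_restart_block =
        (next_block >= rel->relpages) ? NO_SAVED_BLOCK : next_block;
}

/* Called when VACUUM starts: decide where the scan begins. */
uint32_t vacuum_start_block(const RelEntry *rel)
{
    if (rel->vacuum_restart_block == NO_SAVED_BLOCK)
        return 0;
    /* Assumption: if the table shrank since the stop, start over. */
    if (rel->vacuum_restart_block >= rel->relpages)
        return 0;
    return rel->vacuum_restart_block;
}
```

For example, saving block 20 on a 100-block table makes the next run
start at block 20, and saving block 100 clears the stored position.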

"Change VACUUM cost delay settings on-the-fly" feature
------------------------------------------------------

When the end of the maintenance window comes, we might notify VACUUM to
use a set of less aggressive cost delay settings.

I don’t have a clear idea of how to implement this feature yet. Maybe
we need a message-passing mechanism between backends to exchange the
cost delay settings, like the patch here:
http://archives.postgresql.org/pgsql-patches/2006-04/msg00047.php

Another simple way to achieve this is to store the settings for
different maintenance windows in a system catalog. There have been some
previous discussions about implementing maintenance windows, but they
have not been continued. So it seems better to implement this feature
after maintenance windows are implemented.
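
To make the catalog-based idea concrete, here is a rough sketch of a
per-window lookup. No such catalog exists in this proposal, so the
MaintWindow layout, the hour granularity and the function names are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical row of a maintenance-window catalog. */
typedef struct MaintWindow {
    int start_hour;    /* inclusive, 0-23 */
    int end_hour;      /* exclusive; less than start_hour if the window
                        * wraps past midnight */
    int cost_delay_ms; /* cost delay to apply inside this window */
} MaintWindow;

/* Returns nonzero if hour falls inside the window. */
static int hour_in_window(const MaintWindow *w, int hour)
{
    if (w->start_hour <= w->end_hour)
        return hour >= w->start_hour && hour < w->end_hour;
    /* Window wraps around midnight, e.g. 23:00 to 01:00. */
    return hour >= w->start_hour || hour < w->end_hour;
}

/* Pick the cost delay for the given hour; the first matching window wins. */
int vacuum_cost_delay_for_hour(const MaintWindow *windows, size_t n,
                               int hour, int default_delay_ms)
{
    assert(hour >= 0 && hour < 24);
    for (size_t i = 0; i < n; i++) {
        if (hour_in_window(&windows[i], hour))
            return windows[i].cost_delay_ms;
    }
    return default_delay_ms;
}
```

A running VACUUM could repeat such a lookup at its delay points, so the
setting changes when a window ends, without any message passing between
backends.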

Implementation plan
-------------------

Changing the VACUUM cost delay settings on the fly requires either an
internal message-passing mechanism or the implementation of maintenance
windows, so this may not be a good time to rush it. But I hope the
*restartable VACUUM feature* can be accepted for 8.3.

I look forward to your comments and suggestions.

Best Regards

Galy Lee <lee.galy _at_ ntt.oss.co.jp>
NTT Open Source Software Center
