Skip site navigation (1) Skip section navigation (2)

Resumable vacuum proposal and design overview

From: Galy Lee <lee(dot)galy(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Resumable vacuum proposal and design overview
Date: 2007-02-26 09:21:50
Message-ID: 45E2A6AE.1080805@oss.ntt.co.jp (view raw or flat)
Thread:
Lists: pgsql-hackers
Hi

We are developing a new feature for vacuum, here is a brief overview
about it.

Introduction
------------

A) What is it?

This feature enables vacuum has resumable capability. Vacuum can
remembers the point it stops, then resumes interrupted vacuum operation
from the point next time.

The SQL syntaxes for this feature has the format similar to:

   # VACUUM ALL  --> It has the same functionality with current vacuum;
   # VACUUM      --> It uses the information saved by previous
                     interrupted vacuum to continue vacuum processing.

B) Why do we need it?

For large table, although it can be vacuum by enabling vacuum cost-based
delay, but the processing may last for several days. It definitely has
negative affect on system performance. So if systems which has
maintenance time, it is preferred to vacuum in maintenance window.
Vacuum task can be split into small subtasks, and they can be
scheduled into maintenance window time slot. This can reduce the impact
of vacuum to system service.

But currently vacuum task can not be split: if an interrupt or error
occurs during vacuum processing, vacuum totally forgets what it has
done and terminates itself. Following vacuum on the same table has to
scan from the beginning of the heap block. This proposal enable vacuum
has capability to stop and resume.

C) How can we use it?

This feature can enable autovacuum or cron-vacuum-scheduler to develop
more sophisticated vacuum schedule schemes combined with *maintenance
window*.

For example, if the system has two hour maintenance window every day,
vacuum can be interrupted by this way:

   SET statement_timeout TO 2*3600*1000; # two hours
   VACUUM freeze talname;
   SET statement_timeout TO DEFAULT;

Or it can be interrupted by SIGINT directly.

Autovacuum or manual vacuum scheduler can split a large vacuum task
into small subtasks by this feature. Subtasks can be scheduled according
to system load.

Design Overview
---------------

A) Vacuum internal overview

Concurrent vacuum mainly has the following steps to vacuum a table:

   1. scan heap to collect dead tuple list
   2. (if the table has indexes) scan and sweep indexes
   3. sweep dead tuples collected in step 1
   4. perform additional index cleanup operation
   5. (if a certain of free space found) truncate table
   6. register free space to FSM
   7. update statistics of the table

If maintenance memory is not sufficient to contain all of dead tuples,
step 1-3 are repeated.

C) Where to stop

The first option is that it can accept stop request at boundary
of each steps, and it resumes from the unfinished step. But some
steps takes a short time, some take a long time. For example,
step 1 takes more than 40% of total time, but step 5 and 6 take
less than 1% of total time. This granularity is too larger.

The second option is to accept stop request before vacuum begin to
process one of the blocks in step 1-4. In current vacuum implementation,
vacuum_delay_point is also placed in such locations. This option has a
much good granularity than option 1.

This implementation accepts stop request at *blocks level* in step 1-4.

D) How to stop and resume

- stop:

    When vacuum stop in step 1-4, vacuum perform following things:
    vacuum saves dead tuple list, the heap block number on which it
    stop, unswept index relations, unswept dead tuple and FreezeLimit
    into a disk file.

    Free space information collected in step 1-3 can be registered to
    FSM when vacuum is interrupted.

- resume:

    When vacuum is resuming, it reads out saved information and skip the
    finished operations, then continue to finish remaining operations.

    There are two additional issues which need to be discussed here:
     *) FreezeLimit.
        There are two options to select FreezeLimit for a resuming a
        vacuum:(a) FreezeLimit of interrupted vacuum, (b) FreezeLimit of
        resuming vacuum. FreezeLimit-(b) is safe. But for the heap
        blocks are  not full scanned, so when FreezeLimit-(b) is used ,
        the relfrozenxid should be updated with FreezeLimit-(a) at the
        end of vacuum, and CLOG can only be truncated by FreezeLimit-(a).

     *) Resuming index operation.
        There are two possible resuming levels when vacuum is interrupted
        in step 2:(a) skip the *index relations* which have been swept
        completely (b) skip the *index blocks* which have been swept.
        Level (a) is safe and simple to be implemented; level (b) need to
        consider the scenarios that leaf page is split; further
        investigation is needed to clarify if it is safe or not.

        This implementation adopts *level (a) resuming*.

3. Implementation Plan
----------------------

We are working on the patch now; I will send the WIP patch to the list
later. I am sorry this late proposal, but I hope it can go into 8.3.

Welcome your comments and ideas.

Best Regards
Galy Lee (lee(dot)galy(at)oss(dot)ntt(dot)co(dot)jp)
NTT Open Source Software Center

Responses

pgsql-hackers by date

Next:From: Markus SchiltknechtDate: 2007-02-26 09:42:11
Subject: Re: SCMS question
Previous:From: Michael PaesoldDate: 2007-02-26 09:10:47
Subject: Re: conversion efforts (Re: SCMS question)

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group