Re: Resumable vacuum proposal and design overview

From: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
To: Galy Lee <lee(dot)galy(at)oss(dot)ntt(dot)co(dot)jp>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Resumable vacuum proposal and design overview
Date: 2007-02-27 16:37:45
Message-ID: 20070227163745.GH29041@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 27, 2007 at 11:44:28AM +0900, Galy Lee wrote:
> For example, there is one table:
> - The table is a hundreds GBs table.
> - It takes 4-8 hours to vacuum such a large table.
> - Enabling cost-based delay may make it last for 24 hours.
> - It can be vacuumed during night time for 2-4 hours.
>
> It is true there is no such restrict requirement that vacuum
> need to be interrupt immediately, but it should be stopped in an
> *predictable way*. In the above example, if we have to wait for the end
> of one full cycle of cleaning, it may take up to 8 hours for vacuum to
> stop after it has received stop request. This seems quit unacceptable.

Even with very large tables, you could likely still fit things into a
specific time frame by adjusting how much time is spent scanning for
dead tuples. The idea would be to give vacuum a target run time, and it
would monitor how much time it had remaining, taking into account how
long it should take to scan the indexes based on how long it's been
taking to scan the heap. When the amount of time left becomes less than
the estimate of the amount of time required to scan the indexes (and
clean the heap), you stop the heap scan and start scanning indexes. As
long as the IO workload on the machine doesn't vary wildly between the
heap scan and the rest of the vacuum process, I would expect this to
work out fairly well.

While not as nice as the ability to 'stop on a dime' as Tom puts it,
this would be much easier and safer to implement. If there's still a
need for something better after that we could revisit it at that time.
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2007-02-27 16:42:17 7.x horology regression test on Solaris buildfarm machines
Previous Message Joshua D. Drake 2007-02-27 16:31:34 Re: Developer TODO List as a PostgreSQL DB