Re: Avoiding second heap scan in VACUUM

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Avoiding second heap scan in VACUUM
Date: 2008-05-28 20:32:06
Message-ID: 1212006726.4489.665.camel@ebony.site
Lists: pgsql-hackers


On Wed, 2008-05-28 at 16:56 +0530, Pavan Deolasee wrote:

> 2. It then waits for all the existing transactions to finish to make
> sure that everyone can see the change in the pg_class row

I'm not happy that the VACUUM waits. It might wait a very long time and
cause worse overall performance than the impact of the second scan.

Happily, I think we already have a solution to this overall problem
elsewhere in the code. When we VACUUM away all the index entries on a
page we don't remove the page right away. We only add it to the FSM on
the second pass of that page on the *next* VACUUM.

So the idea is to have one pass per VACUUM, but make that one pass do
the first pass of *this* VACUUM and the second pass of the *last*
VACUUM.

We mark the xid of the VACUUM in pg_class as you suggest, but we do it
after VACUUM has completed the pass.

In the single pass we mark DEAD line pointers as RECENTLY_DEAD. If the
last VACUUM xid is old enough we also mark RECENTLY_DEAD line pointers
as UNUSED during this first pass. If the last xid is not old enough we
do a second pass to remove them.
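
The scheme above can be sketched roughly as follows. This is a toy model
in Python, not backend code: the Table class, the LP enum, the vacuum()
function and the oldest_running_xid parameter are all hypothetical names
made up for illustration, "old enough" is assumed to mean "older than
the oldest running transaction", xid wraparound is ignored, and the
fallback second pass is assumed to reclaim every RECENTLY_DEAD pointer
the way the existing two-pass VACUUM would after index cleanup.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class LP(Enum):
    """Line pointer states used in this sketch (illustrative only)."""
    NORMAL = "normal"
    DEAD = "dead"
    RECENTLY_DEAD = "recently_dead"
    UNUSED = "unused"


@dataclass
class Table:
    line_pointers: List[LP]
    last_vacuum_xid: int  # stands in for the xid recorded in pg_class


def vacuum(table: Table, my_xid: int, oldest_running_xid: int) -> int:
    """Run one VACUUM over the toy table; return the number of heap scans."""
    # Assumption: the last VACUUM is "old enough" once no running
    # transaction is older than it (wraparound ignored).
    old_enough = table.last_vacuum_xid < oldest_running_xid
    scans = 1

    # Single pass: do the first pass of *this* VACUUM and, when allowed,
    # the second pass of the *last* VACUUM.
    for i, lp in enumerate(table.line_pointers):
        if lp is LP.DEAD:
            table.line_pointers[i] = LP.RECENTLY_DEAD
        elif lp is LP.RECENTLY_DEAD and old_enough:
            table.line_pointers[i] = LP.UNUSED

    if not old_enough:
        # Fall back to a second scan to remove the pointers now.
        scans += 1
        for i, lp in enumerate(table.line_pointers):
            if lp is LP.RECENTLY_DEAD:
                table.line_pointers[i] = LP.UNUSED

    # Record the VACUUM xid only after the pass has completed.
    table.last_vacuum_xid = my_xid
    return scans
```

With a table last vacuumed at xid 100 and the oldest running transaction
at xid 150, vacuum() does one scan and leaves this run's dead pointers
as RECENTLY_DEAD for the next VACUUM to reclaim. If the oldest running
transaction is at xid 90 instead, it does two scans.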

That has the effect that large tables that are infrequently VACUUMed
will need only a single scan. Smaller tables that require almost
continual VACUUMing will probably do two scans, but who cares?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
