Re: Vacuum-full very slow

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Steve Crawford <scrawford(at)pinpointresearch(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Vacuum-full very slow
Date: 2007-04-25 21:08:49
Message-ID: 5905.1177535329@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Steve Crawford wrote:
>> Am I reading that what it actually does is to thrash around keeping
>> indexes unnecessarily updated, bloating them in the process?

> Yes.

Just for the record, it's not "unnecessary". The point of that is to
not leave a corrupted table behind if VACUUM FULL fails midway through.
The algorithm is:

1. copy tuples to lower blocks, inserting index entries for them too

During this stage, if we fail then the copied tuples are invalid (since
they were inserted by a failed transaction) and so no corruption.
Meanwhile the original tuples are marked as "moved by this vacuum
transaction", but their validity is not affected by that.

2. mark the transaction committed

This atomically causes all the copied tuples to be GOOD and all the
originals to be INVALID according to the tuple validity rules.

3. remove the index entries for moved-off tuples

If we crash here, some of the invalid tuples will have index entries
and some won't, but that doesn't matter because they're invalid.
(The next vacuum will take care of finishing the cleanup.)

4. remove the moved-off tuples (which just requires truncating the
table)

I don't see a way to remove the old index entries before inserting new
ones without creating a window where the index and table will be
inconsistent if vacuum fails.

CLUSTER avoids all this thrashing by recopying the whole table, but
of course that has peak space requirements approximately twice the
table size (and is probably not a win anyway unless most of the table
rows need to be moved). You pays your money, you takes your choice.

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Listmail 2007-04-25 22:13:13 Re: Vacuum-full very slow
Previous Message Thomas F. O'Connell 2007-04-25 20:26:19 Re: [DOCS] Incrementally Updated Backups: Docs Clarification

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2007-04-25 21:14:17 Re: BUG #3245: PANIC: failed to re-find shared loc k o b j ect
Previous Message Gustavo Tonini 2007-04-25 20:47:34 Re: Fragmentation project