Re: remove flatfiles.c

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: remove flatfiles.c
Date: 2009-09-02 19:30:26
Message-ID: 407d949e0909021230q40d520far51de9e63162f5861@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 2, 2009 at 8:10 PM, Robert Haas<robertmhaas(at)gmail(dot)com> wrote:
> I confess to being a little fuzzy on the details of how this
> implementation (seq-scanning the source table for live tuples) is
> different/better from the current VACUUM FULL implementation.  Can
> someone fill me in?

VACUUM FULL is a *lot* more complex.

It scans pages *backwards* from the end (which does wonderful things
on rotating media). It marks each live tuple it finds as "moved off",
finds a new place for it (using the free space map, I think?), inserts
the tuple on the new page, marks it "moved in", and updates the
indexes.

Then it commits the transaction but keeps the lock. Then it has to
vacuum all the indexes of the references to the old tuples at the end
of the table. I think it has to commit that too before it can finally
truncate the table.
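
A toy sketch of the tuple-moving part described above, in Python rather than anything resembling the real C code. The names `find_free_slot` and `compact_backwards` are made up for illustration, a "page" is just a list of slots, and the linear search stands in for whatever free-space lookup the real code does. It ignores transactions, the "moved off"/"moved in" flags, and index maintenance entirely; it only shows the backwards scan, the move towards the front, and the final truncate.

```python
def find_free_slot(pages, before_page):
    """Return (page, slot) of the first free slot on a page earlier than before_page."""
    for p in range(before_page):
        for s, tup in enumerate(pages[p]):
            if tup is None:
                return p, s
    return None


def compact_backwards(pages):
    """Move live tuples from the end of the table into free slots near the front.

    pages is a list of pages; each page is a list whose entries are a tuple
    value or None (free slot). Returns the list of moves performed, which is
    the information index maintenance would need.
    """
    moves = []
    out_of_space = False
    # Scan pages from the last one towards the first.
    for src in range(len(pages) - 1, 0, -1):
        for slot in range(len(pages[src]) - 1, -1, -1):
            tup = pages[src][slot]
            if tup is None:
                continue
            dest = find_free_slot(pages, src)
            if dest is None:
                # No room on any earlier page: nothing more can be moved.
                out_of_space = True
                break
            pages[dest[0]][dest[1]] = tup  # the copy that would be "moved in"
            pages[src][slot] = None        # the original that would be "moved off"
            moves.append((tup, (src, slot), dest))
        if out_of_space:
            break
    # Truncate trailing pages that are now completely empty.
    while pages and all(t is None for t in pages[-1]):
        pages.pop()
    return moves
```

Even in this toy form the access pattern is visible: reads walk from the last page down while writes land on the first pages.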

The backwards scan is awful for rotating media. The reading from the
end and writing to the beginning is bad too, though hopefully the
cache can help that.

A lot of the complexity comes in from other parts of the system that
have to be aware of tuples that have been "moved off" or "moved in".
They have to be able to check whether the vacuum committed or not.

That reminds me, there was another proposal to do an "online" vacuum
full similar to our concurrent index builds. It would do no-op updates
to tuples at the end of the table, hopefully finding space for them
earlier in the table, wait until those transactions are no longer
visible to anyone else, and then truncate. (Actually I think you could
just not do anything and let regular lazy vacuum do the truncate.) That
might be a good practical alternative for sites where copying their
entire table isn't practical.
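
A rough Python model of that "online" idea, purely as a sketch under my own assumptions: `Version`, `noop_update` and `lazy_vacuum` are hypothetical names, transaction ids are plain integers, and visibility is reduced to a single "oldest xid anyone can still see" number. Here the update only ever takes a slot on an earlier page, which is the hoped-for case rather than a guarantee.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    key: str
    xmin: int                   # xid that created this version
    xmax: Optional[int] = None  # xid that replaced it, if any


def noop_update(pages, page_no, slot, xid):
    """Rewrite a tuple unchanged, placing the new version on an earlier page.

    Returns the new (page, slot), or None if no earlier page has free space,
    in which case the tuple is left alone.
    """
    old = pages[page_no][slot]
    for p in range(page_no):
        for s, v in enumerate(pages[p]):
            if v is None:
                pages[p][s] = Version(old.key, xmin=xid)
                old.xmax = xid  # old version stays until nobody can see it
                return p, s
    return None


def lazy_vacuum(pages, oldest_visible_xid):
    """Remove versions that are dead to everyone, then truncate empty tail pages."""
    for page in pages:
        for s, v in enumerate(page):
            if v is not None and v.xmax is not None and v.xmax < oldest_visible_xid:
                page[s] = None
    while pages and all(v is None for v in pages[-1]):
        pages.pop()
```

The point of the sketch is the waiting step: the last page cannot be truncated until the replaced versions are invisible to every snapshot, which is why an ordinary lazy vacuum could finish the job later.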

--
greg
http://mit.edu/~gsstark/resume.pdf
