Re: 7.3.1 takes long time to vacuum table?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Mark Cave-Ayland" <m(dot)cave-ayland(at)webbased(dot)co(dot)uk>
Cc: "Martijn van Oosterhout" <kleptog(at)svana(dot)org>, shridhar_daithankar(at)persistent(dot)co(dot)in, "PostgreSQL General" <pgsql-general(at)postgresql(dot)org>
Subject: Re: 7.3.1 takes long time to vacuum table?
Date: 2003-02-19 16:28:13
Message-ID: 11914.1045672093@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

"Mark Cave-Ayland" <m(dot)cave-ayland(at)webbased(dot)co(dot)uk> writes:
> Interestingly this could be used to create a speedy vacuum - that is,
> create a new table with a temporary name that is invisible to the
> database (similar to dropped columns), then taking into account the disk
> space left on the device, pick the last X pages from the old table and
> write to the new table. Then truncate the file containing the table at
> point X and repeat until finished. Finally kill the old table and make
> the new one visible.

And if you crash midway through?

> Why does vacuum bother with reordering rows?

It's designed to be fast when there's not very much data motion required
(ie, you only need to pull a relatively small number of rows off the end
to fill in the holes elsewhere).

I have not seen any actual evidence that doing it any other way would be
faster. Yes, it's reading the source tuples backwards instead of
forwards, but that's at most a third of the total I/O load (you've also
got tuple output and WAL writes to think about). It's not clear that
any kernel read-ahead optimization could get a chance to work anyhow.

> In fact, my colleague has just done a test with SELECT..INTO on our dev
> version and it compacted 600Mb -> 400Mb in just 40s(!). We then did a
> vacuum full on the same original 600Mb table which is still going after
> 20mins.

Are there indexes on the original table? If so, this isn't a fair
comparison.

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Steve Crawford 2003-02-19 16:31:21 Re: continuous data from stdin
Previous Message Mark Cubitt 2003-02-19 16:26:01 creating table