Re: Feedback on getting rid of VACUUM FULL

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Hannu Krosing <hannu(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback on getting rid of VACUUM FULL
Date: 2009-09-16 21:43:45
Message-ID: 603c8f070909161443n46f36466mbc615cd2d4d47ac9@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 16, 2009 at 4:53 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> Hannu Krosing wrote:
>> On Wed, 2009-09-16 at 21:23 +0300, Heikki Linnakangas wrote:
>>> 2) Another utility that does something like UPDATE ... WHERE ctid > ? to
>>> move tuples to lower pages. It will be different from current VACUUM
>>> FULL in some ways. It won't require a table lock, for example, but it
>>> won't be able to move update chains as nicely. But it would be trivial
>>> to write one, so I think we should offer that as a contrib module.
>>
>> I have not checked, but I suspect pg_reorg may already be doing
>> something similar http://pgfoundry.org/forum/forum.php?forum_id=1561
>
> Hmm, AFAICT pg_reorg is much more complex, writing stuff to a temp table
> and swapping relfilenodes afterwards. More like the VACUUM REWRITE
> that's been discussed.
>
> For the kicks, I looked at what it would take to write a utility like
> that. It turns out to be quite trivial, patch attached. It uses the same
> principle as VACUUM FULL, scans from the end, moving tuples to
> lower-numbered pages until it can't do it anymore. It requires a small
> change to heap_update(), to override the preference to store the new
> tuple on the same page as the old one, but other than that, it's all in
> the external module.
>
> To test:
>
> -- Create and populate test table
> CREATE TABLE foo (id int4 PRIMARY KEY);
> INSERT INTO foo SELECT a FROM generate_series(1,100000) a;
>
> -- Delete a lot of tuples from the beginning. This creates the hole that
> we want to compact out.
> DELETE FROM foo WHERE id < 90000;
>
> -- Vacuum to remove the dead tuples
> VACUUM VERBOSE foo;
>
> -- Run the utility to "move" the tuples
> SELECT vacuumfull('foo');
>
> -- Vacuum table again to remove the old tuple versions of the moved rows
> and truncate the file.
>
> VACUUM VERBOSE foo;

I think this should be in core, not a contrib module.

I also wonder whether we should consider teaching regular VACUUM to do
a little of this every time it's run. Right now, once your table gets
bloated, it stays bloated forever, until you intervene. Making it
slowly get better by itself would reduce the number of people who live
with the problem for a month or a year before writing in to say
"Access to this table seems really slow...".

...Robert

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2009-09-16 21:44:58 Re: PATCH: make plpgsql IN args mutable (v1) [REVIEW]
Previous Message Josh Berkus 2009-09-16 21:31:16 Re: PATCH: make plpgsql IN args mutable (v1) [REVIEW]