Re: Feedback on getting rid of VACUUM FULL

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Heikki <hlinnaka(at)iki(dot)fi>
Subject: Re: Feedback on getting rid of VACUUM FULL
Date: 2015-04-29 17:03:28
Message-ID: CA+Tgmoay-hU6sQaaMJOsQrF2hJdVzSXmWP8diGj1Nnvdb780LA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Apr 28, 2015 at 2:44 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
>> > I think what we need here is something that does heap_update to tuples
>> > at the end of the table, moving them to earlier pages; then wait for old
>> > snapshots to die (the infrastructure for which we have now, thanks to
>> > CREATE INDEX CONCURRENTLY); then truncate the empty pages. Of course,
>> > there are lots of details to resolve. It doesn't really matter that
>> > this runs for long: a process doing this for hours might be better than
>> > AccessExclusiveLock on the table for a much shorter period.
>>
>> Why do you need to do anything other than update the tuples and let
>> autovacuum clean up the mess?
>
> Sure, that's one option. I think autovac's current approach is too
> heavyweight: it always has to scan the whole relation and all the
> indexes. It might be more convenient to do something more
> fine-grained; for instance, maybe instead of scanning the whole
> relation, start from the end of the relation walking backwards and stop
> once the first page containing a live or recently-dead tuple is found.
> Perhaps, while scanning the indexes, you know that all CTIDs pointing to
> pages above some threshold are gone; then you could remove those index
> entries without scanning the heap at all.
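Alvaro's idea above can be sketched as follows. This is a toy simulation, not PostgreSQL code: the page layout, tuple states, and helper names are all hypothetical, and it only illustrates the backward scan plus the CTID-threshold index pruning.

```python
# Simulating the proposed fine-grained truncation: walk heap pages from
# the end, stop at the first page with a live or recently-dead tuple,
# then drop index entries whose CTID points beyond that threshold.
# Tuple states and the (page, offset) CTID encoding are illustrative.

def truncation_point(pages):
    """Return the new page count after truncating trailing pages that
    contain only dead tuples (or are empty)."""
    for pageno in range(len(pages) - 1, -1, -1):
        # Stop at the first page holding a live or recently-dead tuple.
        if any(t in ("live", "recently_dead") for t in pages[pageno]):
            return pageno + 1
    return 0

def prune_index(index_entries, threshold_page):
    """Remove index entries whose CTID points at or past the truncation
    point, without consulting the heap at all."""
    return [(key, (page, off)) for key, (page, off) in index_entries
            if page < threshold_page]

# Hypothetical table: pages 0-1 still hold visible tuples, pages 2-3 do not.
heap = [["live", "dead"], ["recently_dead"], ["dead", "dead"], []]
new_len = truncation_point(heap)   # pages 2 and 3 can be truncated
index = [("a", (0, 1)), ("b", (2, 1)), ("c", (3, 0))]
pruned = prune_index(index, new_len)
```

The point of the sketch is the cost model: the heap scan stops as soon as it hits a page that must be kept, and the index cleanup becomes a simple CTID comparison rather than a heap lookup per entry.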

I agree that scanning all of the indexes is awfully heavy-weight, but
I don't see how we're going to get around that. The problem with
index vac is not that it's expensive to decide which CTIDs need to get
killed, but that we have to search for them in every page of the
index. The only alternative I can see is to regenerate the index
tuples we expect to find, based on the heap tuples we're killing, and
search the index for them one at a time. Tom's been opposed to that in
the past, but maybe it's worth reconsidering.
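The trade-off between the two strategies can be sketched with a toy sorted-list "index". This is illustrative only, not PostgreSQL internals: the current approach pays a cost proportional to the index size, while the retail approach pays per dead tuple but must be able to recompute each index key.

```python
# Contrasting the two index-cleanup strategies: a full index scan that
# filters out dead CTIDs, versus retail deletion that regenerates the
# expected (key, ctid) pair per dead heap tuple and searches for it.
import bisect

def vacuum_full_scan(index, dead_ctids):
    """Current approach: visit every index entry and drop those whose
    CTID is in the kill set -- O(index size)."""
    return [(key, ctid) for key, ctid in index if ctid not in dead_ctids]

def vacuum_retail(index, dead_tuples):
    """Alternative: for each dead heap tuple, regenerate its index
    tuple and binary-search the (sorted) index for it -- O(dead tuples
    * log(index size))."""
    for entry in dead_tuples:
        i = bisect.bisect_left(index, entry)
        if i < len(index) and index[i] == entry:
            index.pop(i)
    return index

idx = sorted([("a", (0, 1)), ("b", (0, 2)), ("c", (5, 1)), ("d", (5, 2))])
dead_ctids = {(5, 1), (5, 2)}
full = vacuum_full_scan(idx, dead_ctids)
retail = vacuum_retail(list(idx), [("c", (5, 1)), ("d", (5, 2))])
```

The catch the thread alludes to is the precondition of `vacuum_retail`: it only works if the index tuple can be reliably regenerated from the heap tuple, which is exactly the assumption that has been contentious.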

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
