Re: online debloatification (was: extending relations more efficiently)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Jeroen Vermeulen <jtv(at)xs4all(dot)nl>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: online debloatification (was: extending relations more efficiently)
Date: 2012-05-02 18:14:31
Message-ID: CA+Tgmoab9Vd3N4zZQaTaMPxDi0d37T-w8AQ1y555pSCzd_=N4w@mail.gmail.com
Lists: pgsql-hackers

On Wed, May 2, 2012 at 1:06 PM, Alvaro Herrera
<alvherre(at)commandprompt(dot)com> wrote:
> Excerpts from Robert Haas's message of Wed May 02 12:55:17 -0400 2012:
>> On Wed, May 2, 2012 at 12:46 PM, Alvaro Herrera
>> <alvherre(at)commandprompt(dot)com> wrote:
>> > Agreed.  Perhaps to solve this issue what we need is a way to migrate
>> > tuples from later pages into earlier ones.  (This was one of the points,
>> > never resolved, that we discussed during the VACUUM FULL rework.)
>>
>> Yeah, I agree.  And frankly, we need to find a way to make it work
>> without taking AccessExclusiveLock on the relation.  Having to run
>> VACUUM FULL is killing actual users and scaring off potential ones.
>
> And ideally without bloating the indexes while at it.

Yeah.

Brainstorming wildly, how about something like this:

1. Insert a new copy of the tuple onto some other heap page. The new
tuple's xmin will be that of the process doing the tuple move, and
we'll also set a flag indicating that a move is in progress.
2. Set a flag on the old tuple, indicating that a tuple move is in
progress. Set its TID to the new location of the tuple. Set xmax to
the tuple mover's XID. Optionally, truncate away the old tuple data,
leaving just the tuple header.
3. Scan all indexes and replace any references to the old tuple's TID
with references to the new tuple's TID.
4. Commit.
5. Once the XID of the tuple mover is all-visible, nuke the old TID
and clear the flag on the new tuple indicating a move-in-progress
(these two operations must be done together, atomically, with a single
WAL record covering both).
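
To make the ordering of the steps above concrete, here is a minimal toy
model in Python. It is only a sketch of the idea, not PostgreSQL code:
the heap is a dict keyed by TID, each index is a dict with one TID per
key, and the names HeapTuple, move_in_progress, forward_tid, begin_move
and finish_move are all made up for illustration. WAL logging, locking
and commit (step 4) are left out entirely.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

TID = Tuple[int, int]  # (block number, offset within block)


@dataclass
class HeapTuple:
    data: Optional[bytes]              # None once the old copy is truncated
    xmin: int                          # XID that created this tuple version
    xmax: Optional[int] = None         # XID that deleted or moved it, if any
    move_in_progress: bool = False     # hypothetical flag from steps 1 and 2
    forward_tid: Optional[TID] = None  # new location, set on the old copy


Heap = Dict[TID, HeapTuple]
Index = Dict[str, TID]  # toy index: exactly one TID per key (an assumption)


def begin_move(heap: Heap, indexes: List[Index],
               old_tid: TID, new_tid: TID, mover_xid: int) -> None:
    """Steps 1 to 3: copy the tuple, mark the old copy, repoint indexes."""
    old = heap[old_tid]

    # Step 1: new copy on another page, stamped with the mover's XID.
    heap[new_tid] = HeapTuple(data=old.data, xmin=mover_xid,
                              move_in_progress=True)

    # Step 2: flag the old copy, point it at the new one, set xmax,
    # and (optionally) keep only the header.
    old.move_in_progress = True
    old.forward_tid = new_tid
    old.xmax = mover_xid
    old.data = None

    # Step 3: replace every index reference to the old TID.
    for index in indexes:
        for key in [k for k, tid in index.items() if tid == old_tid]:
            index[key] = new_tid


def finish_move(heap: Heap, old_tid: TID, new_tid: TID,
                mover_all_visible: bool) -> None:
    """Step 5: only legal once the mover's XID is all-visible."""
    if not mover_all_visible:
        raise ValueError("mover XID is not all-visible yet")
    # In the proposal both changes are covered by a single WAL record.
    del heap[old_tid]
    heap[new_tid].move_in_progress = False


if __name__ == "__main__":
    heap = {(9, 1): HeapTuple(data=b"row", xmin=100)}
    index = {"k": (9, 1)}
    begin_move(heap, [index], (9, 1), (0, 3), mover_xid=200)
    finish_move(heap, (9, 1), (0, 3), mover_all_visible=True)
    print(heap, index)
```

Between begin_move and finish_move both copies exist, and the old one
acts purely as a forwarding stub for anything that still arrives at the
old TID.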

Any scan that encounters the old tuple will decide whether or not it
can see the tuple based on the xmin & xmax in the old tuple's header.
If it decides it can see it, it follows the TID pointer and does its
work using the new tuple instead. Scans that encounter the new tuple
need no special handling; the existing visibility rules are fine for
that case. Prune operations must not truncate away tuples that are
being moved into or out of the page, and vacuum must not mark pages
containing such tuples as all-visible.
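
The scan-side rule in the paragraph above can be sketched the same way.
This is again a toy model in Python under loud assumptions: a snapshot is
reduced to the set of XIDs whose effects it can see, and HeapTuple,
visible, fetch, can_prune and page_may_be_all_visible are invented names,
not anything in the PostgreSQL source.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set, Tuple

TID = Tuple[int, int]  # (block number, offset within block)


@dataclass
class HeapTuple:
    data: Optional[bytes]
    xmin: int
    xmax: Optional[int] = None
    move_in_progress: bool = False     # hypothetical flag
    forward_tid: Optional[TID] = None  # set only on the old copy


def visible(tup: HeapTuple, visible_xids: Set[int]) -> bool:
    """Simplified visibility: creator seen, deleter/mover not seen."""
    created = tup.xmin in visible_xids
    removed = tup.xmax is not None and tup.xmax in visible_xids
    return created and not removed


def fetch(heap: Dict[TID, HeapTuple], tid: TID,
          visible_xids: Set[int]) -> Optional[bytes]:
    """Return the row data a scan would use for this TID, or None."""
    tup = heap[tid]
    if not visible(tup, visible_xids):
        return None
    if tup.forward_tid is not None:
        # Old copy is visible: do the work using the new tuple instead.
        return heap[tup.forward_tid].data
    # New copy (or an ordinary tuple): existing rules, no special case.
    return tup.data


def can_prune(tup: HeapTuple) -> bool:
    """Pruning must leave tuples alone while a move is in progress."""
    return not tup.move_in_progress


def page_may_be_all_visible(tuples: Iterable[HeapTuple]) -> bool:
    """Vacuum must not mark a page all-visible if any tuple is moving."""
    return all(not t.move_in_progress for t in tuples)


if __name__ == "__main__":
    old = HeapTuple(None, xmin=100, xmax=200,
                    move_in_progress=True, forward_tid=(0, 3))
    new = HeapTuple(b"row", xmin=200, move_in_progress=True)
    heap = {(9, 1): old, (0, 3): new}
    for snap in ({100}, {100, 200}):
        print(snap, fetch(heap, (9, 1), snap), fetch(heap, (0, 3), snap))
```

In this model a snapshot that cannot see the mover's XID gets the row
through the old TID only, and a snapshot that can see it gets the row
through the new TID only, so a sequential scan never returns the row
twice.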

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
