Re: Make CLUSTER MVCC-safe

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Make CLUSTER MVCC-safe
Date: 2007-03-21 21:59:44
Message-ID: 200703212159.l2LLxiL09836@momjian.us
Lists: pgsql-patches


Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------

Heikki Linnakangas wrote:
> This patch makes CLUSTER MVCC-safe. Visibility information and update
> chains are preserved, as in VACUUM FULL.
>
> I created a new generic rewriteheap facility to handle rewriting tables
> in a visibility-preserving manner. All the update chain tracking is done
> in rewriteheap.c; the caller is responsible for supplying the stream of
> tuples.
>
> CLUSTER is currently the only user of the facility, but I'm envisioning
> we might have other users in the future, for example a version of
> VACUUM FULL that rewrites the whole table. We could also use it to make
> ALTER TABLE MVCC-safe, but there are some issues with that, for example
> what to do if RECENTLY_DEAD tuples don't satisfy a newly added constraint.
>
> One complication in the implementation was the fact that heap_insert
> overwrites the visibility information, and it doesn't write the full
> tuple header to WAL. I ended up implementing a special-purpose
> raw_heap_insert function instead, which is optimized for bulk inserting
> a lot of tuples, knowing that we have exclusive access to the heap.
> raw_heap_insert keeps the current buffer locked over calls, until it
> gets full, and inserts the whole page to WAL as a single record using
> the existing XLOG_HEAP_NEWPAGE record type.
>
> This makes CLUSTER a more viable alternative to VACUUM FULL. One
> motivation for making CLUSTER MVCC-safe is that if some poor soul runs
> pg_dump to make a backup concurrently with CLUSTER, the clustered tables
> will appear to be empty in the dump file.
>
> The documentation doesn't say anything about CLUSTER not being MVCC-safe,
> so I suppose there's no need to touch the docs. I sent a doc patch earlier
> to add a note about it; that doc patch should still be applied to older
> release branches, IMO.
>
> --
> Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com

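For readers following the quoted description of raw_heap_insert, the page-at-a-time idea can be sketched roughly as below. This is a toy model, not code from the patch: the struct and the sketch_* functions are hypothetical names, the 8192-byte page size is an assumption, and "logging" is reduced to a counter standing in for one whole-page WAL record per filled page. Tuples are copied verbatim, header included, which is the point of bypassing heap_insert.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8192   /* assumed page size for this sketch */

/* Hypothetical rewrite state; not the patch's actual data structure. */
typedef struct RewriteSketch
{
    char    page[SKETCH_PAGE_SIZE]; /* page currently being filled */
    size_t  used;                   /* bytes used on the current page */
    int     pages_logged;           /* whole-page records emitted so far */
} RewriteSketch;

static void
sketch_init(RewriteSketch *st)
{
    memset(st, 0, sizeof(*st));
}

/*
 * Stand-in for writing out the current page and logging it as a single
 * whole-page record. An empty page is not logged.
 */
static void
sketch_flush_page(RewriteSketch *st)
{
    if (st->used == 0)
        return;
    st->pages_logged++;
    st->used = 0;
}

/*
 * Append one tuple image unchanged, so its header (and thus its
 * visibility information) is preserved. The current page is kept across
 * calls and flushed only when the next tuple would not fit.
 * Returns the zero-based number of the page the tuple landed on.
 */
static int
sketch_raw_insert(RewriteSketch *st, const void *tuple, size_t len)
{
    assert(len > 0 && len <= SKETCH_PAGE_SIZE);

    if (st->used + len > SKETCH_PAGE_SIZE)
        sketch_flush_page(st);

    memcpy(st->page + st->used, tuple, len);
    st->used += len;
    return st->pages_logged;
}
```

A caller would invoke sketch_raw_insert once per tuple in the stream and call sketch_flush_page once at the end to emit the final, partially filled page. The real facility also has to track update chains and deal with buffers and locking, none of which is modeled here.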

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
