Re: Page-level version upgrade (was: Block-level CRC checks)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, decibel <decibel(at)decibel(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Aidan Van Dyk <aidan(at)highrise(dot)ca>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Page-level version upgrade (was: Block-level CRC checks)
Date: 2009-12-02 03:21:41
Message-ID: 603c8f070912011921h3ddfb589od14529ddf42fd45d@mail.gmail.com
Lists: pgsql-hackers

On Tue, Dec 1, 2009 at 9:31 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Robert Haas wrote:
>> On Tue, Dec 1, 2009 at 5:15 PM, Greg Stark <gsstark(at)mit(dot)edu> wrote:
>> > On Tue, Dec 1, 2009 at 9:58 PM, decibel <decibel(at)decibel(dot)org> wrote:
>> >> What happened to the work that was being done to allow a page to be upgraded
>> >> on the fly when it was read in from disk?
>> >
>> > There were no page level changes between 8.3 and 8.4.
>>
>> That's true, but I don't think it's the full and complete answer to
>> the question.  Zdenek submitted a patch for CF 2008-11 which attempted
>> to add support for multiple page versions.  I guess we're on v4 right
>> now, and he was attempting to add support for v3 pages, which would
>> have allowed reading in pages from old PG versions.  To put it
>> bluntly, the code wasn't anything I would have wanted to deploy, but
>> the reason why Zdenek gave up on fixing it was because several
>> community members considerably senior to myself provided negative
>> feedback on the concept.
>
> Well, there were quite a number of open issues relating to page
> conversion:
>
>        o  Do we write the old version or just convert on read?
>        o  How do we write pages that get larger on conversion to the
>           new format?
>
> As I remember the patch allowed read/write of old versions, which greatly
> increased its code impact.

Oh, for sure there were plenty of issues with the patch, starting with
the fact that the way it was set up led to unacceptable performance
and code complexity trade-offs. Some of my comments from the time:

http://archives.postgresql.org/pgsql-hackers/2008-11/msg00149.php
http://archives.postgresql.org/pgsql-hackers/2008-11/msg00152.php

But the point is that the concept, I think, is basically the right
one: you have to be able to read and make sense of the contents of old
page versions. There is room, at least in my book, for debate about
which operations we should support on old pages. Totally read only?
Set hint bits? Kill old tuples? Add new tuples?

The key issue, as I think Heikki identified at the time, is to figure
out how you're eventually going to get rid of the old pages. He
proposed running a pre-upgrade utility on each page to reserve the
right amount of free space.

http://archives.postgresql.org/pgsql-hackers/2008-11/msg00208.php

I don't like that solution. If the pre-upgrade utility is something
that has to be run while the database is off-line, then it defeats the
point of an in-place upgrade. If it can be run while the database is
up, I fear it will need to be deeply integrated into the server. And
since we can't know the requirements for how much space to reserve
(and it needn't be a constant) until we design the new feature, this
will likely mean backpatching a rather large chunk of complex code,
which, to put it mildly, is not the sort of thing we normally would
even consider. I think a better approach is to support reading tuples
from old pages, but to write all new tuples into new pages. A
full-table rewrite (like UPDATE foo SET x = x, CLUSTER, etc.) can be
used to propel everything to the new version, with the usual tricks
for people who need to rewrite the table a piece at a time. But, this
is not religion for me. I'm fine with some other design; I just can't
presently see how to make it work.
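
To make the read-old/write-new idea concrete, here is a toy model in Python (not server code, and not anything from Zdenek's patch). Everything in it is an assumption for illustration: the Page class, the per-page capacity, and the pretend difference between the v3 and v4 tuple encodings. It only shows the shape of the rule: reads dispatch on the page version, inserts never touch an old page, and a full rewrite leaves only current-version pages behind.

```python
from dataclasses import dataclass, field

CURRENT_VERSION = 4  # the thread calls the current layout v4 and the old one v3


@dataclass
class Page:
    version: int
    tuples: list = field(default_factory=list)
    capacity: int = 4  # hypothetical tuples-per-page limit


def read_tuples(page):
    """Decode the tuples on a page of any supported layout version."""
    if page.version == 3:
        # Hypothetical old encoding: each tuple is wrapped as (value,).
        return [t[0] for t in page.tuples]
    if page.version == CURRENT_VERSION:
        return list(page.tuples)
    raise ValueError(f"unsupported page version {page.version}")


def insert(heap, value):
    """Add a tuple, using only current-version pages with free space."""
    for page in heap:
        if page.version == CURRENT_VERSION and len(page.tuples) < page.capacity:
            page.tuples.append(value)
            return page
    page = Page(CURRENT_VERSION, [value])  # old pages are never extended
    heap.append(page)
    return page


def rewrite(heap):
    """Full-table rewrite: copy every tuple into fresh current-version pages."""
    new_heap = []
    for page in heap:
        for value in read_tuples(page):
            insert(new_heap, value)
    return new_heap


heap = [Page(3, [("a",), ("b",)])]        # one page left over from the old release
insert(heap, "c")                         # lands on a new v4 page
print([p.version for p in heap])          # old page is still there, untouched
print([p.version for p in rewrite(heap)]) # only v4 pages remain
```

The model keeps old pages strictly read-only, which is the most conservative answer to the "which operations" question above; allowing hint-bit updates or tuple kills on old pages would mean adding version-specific write paths to it.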

I think the present discussion of CRC checks is an excellent test-case
for any and all ideas about how to solve this problem. If someone can
get a patch committed that can convert the 8.4 page format to an 8.5
format with the hint bits shuffled around and a (hopefully optional) CRC
added, I think that'll become the de facto standard for how to handle
page format upgrades.
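
For anyone who wants something concrete to poke at, here is a minimal sketch of what "optional CRC" could mean at the page level, again as a Python toy rather than a proposal for the real page header. The layout is made up: a 4-byte CRC at offset 0 and a flag byte at offset 4 whose low bit says whether the CRC is present. The only real-world numbers are the default 8192-byte block size and CRC-32 as computed by zlib.

```python
import struct
import zlib

PAGE_SIZE = 8192   # PostgreSQL's default block size
CRC_OFFSET = 0     # hypothetical: 4-byte CRC at the start of the page
FLAGS_OFFSET = 4   # hypothetical: one flag byte right after it
HAS_CRC = 0x01     # hypothetical flag bit: "this page carries a CRC"


def set_crc(page: bytearray) -> None:
    """Mark the page as checksummed and store the CRC of its contents."""
    page[FLAGS_OFFSET] |= HAS_CRC
    # The CRC field itself is zeroed while the checksum is computed.
    struct.pack_into("<I", page, CRC_OFFSET, 0)
    struct.pack_into("<I", page, CRC_OFFSET, zlib.crc32(bytes(page)))


def verify(page) -> bool:
    """Return True if the page has no CRC or its stored CRC matches."""
    if not page[FLAGS_OFFSET] & HAS_CRC:
        return True  # the checksum is optional, so nothing to check
    (stored,) = struct.unpack_from("<I", page, CRC_OFFSET)
    zeroed = bytearray(page)
    struct.pack_into("<I", zeroed, CRC_OFFSET, 0)
    return zlib.crc32(bytes(zeroed)) == stored


page = bytearray(PAGE_SIZE)
page[100:105] = b"hello"
print(verify(page))   # page without the flag passes trivially
set_crc(page)
print(verify(page))   # freshly checksummed page
page[200] ^= 0xFF     # simulate on-disk corruption
print(verify(page))   # mismatch is detected
```

Note that this toy ignores the hard part, which is that any in-place change to a checksummed page, hint bits included, invalidates the stored CRC until it is recomputed.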

...Robert
