Re: [WIP] In-place upgrade

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com>, Greg Stark <greg(dot)stark(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP] In-place upgrade
Date: 2008-11-06 23:51:20
Message-ID: Pine.GSO.4.64.0811061754120.15452@westnet.com
Lists: pgsql-hackers

On Thu, 6 Nov 2008, Tom Lane wrote:

> Another thought here is that I don't think we are yet committed to any
> changes that require extra space between 8.3 and 8.4, are we? The
> proposed addition of CRC words could be put off to 8.5, for instance.

I was just staring at that code as you wrote this thinking about the same
thing. CRCs are a great feature I'd really like to see. On the other
hand, announcing that 8.4 features in-place upgrades for 8.3 databases,
and that the project has laid the infrastructure such that future releases
will also upgrade in-place, would IMHO be the biggest positive
announcement of the new release by a large margin. At least then new
large (>1TB) installs could kick off on either the stable 8.3 or 8.4
knowing they'd never be forced to deal with dump/reload, whereas right now
there is no reasonable solution for them that involves PostgreSQL (I just
crossed 3TB on a system last month and I'm not looking forward to its
future upgrades).

Two questions come to mind here:

-If you reduce the page layout upgrade problem to "convert from V4 to V5
adding support for CRCs", is there a worthwhile simpler path to handling
that without dragging the full complexity of the older page layout changes
in?

-Is it worth considering making CRCs an optional compile-time feature, and
that (for now at least) you couldn't get them and the in-place upgrade at
the same time?

Stepping back for a second, the idea that in-place upgrade is only
worthwhile if it yields zero downtime isn't necessarily true. Even
having an offline-only upgrade tool to handle the more complicated
situations where tuples have to be squeezed onto another page would still
be a major improvement over the current situation. The thing that you
have to recognize here is that dump/reload is extremely slow because of
bottlenecks in the COPY process. That makes for a large amount of
downtime--many hours isn't unusual.

If older version upgrade downtime was reduced to how long it takes to run
a "must scan every page and fiddle with it if full" tool, that would still
be a giant improvement over the current state of things. If Zdenek's
figures that only a small percentage of pages will need such adjustment
hold up, that should take only some factor longer than a sequential scan
of the whole database. That's not instant, but it's at least an order of
magnitude faster than a dump/reload on a big system.
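
To make the "scan every page" idea concrete, here is a rough sketch of the
read-only half of such a tool: it walks the pages of one relation file and
reports which ones lack room for a larger header. This is illustrative only
and not taken from Zdenek's patch. The function names are hypothetical, and
the header offsets (pd_lower and pd_upper as 16-bit values at bytes 12 and
14) and the 8192-byte page size are assumptions based on the 8.3 page
layout with default build settings.

```python
import struct

PAGE_SIZE = 8192  # assumed: default PostgreSQL block size

# Assumed 8.3 (V4) page header: pd_lower and pd_upper are consecutive
# 16-bit fields starting at byte offset 12, in the machine's native byte order.
_LOWER_UPPER = struct.Struct("=HH")
_LOWER_UPPER_OFFSET = 12


def page_free_space(page):
    """Return the bytes of free space between the line pointers and tuples."""
    pd_lower, pd_upper = _LOWER_UPPER.unpack_from(page, _LOWER_UPPER_OFFSET)
    if pd_upper == 0:
        # A never-initialized (all-zero) page holds no tuples to squeeze.
        return len(page)
    return max(pd_upper - pd_lower, 0)


def pages_needing_adjustment(pages, extra_bytes):
    """List the page numbers that cannot absorb extra_bytes of new header."""
    return [
        number
        for number, page in enumerate(pages)
        if page_free_space(page) < extra_bytes
    ]


def iter_pages(path, page_size=PAGE_SIZE):
    """Yield successive fixed-size pages from a relation file on disk."""
    with open(path, "rb") as relation_file:
        while True:
            page = relation_file.read(page_size)
            if len(page) < page_size:
                break
            yield page
```

Something like pages_needing_adjustment(iter_pages(path), 4) would give the
list of pages where tuples have to move elsewhere. The cost is one
sequential read of the file, which is why the total runtime should be some
multiple of a sequential scan rather than anything like a dump/reload.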

The idea that you're going to get in-place upgrade all the way back to 8.2
without taking the database down for even a little bit to run such a
utility is hard to pull off, and it's impressive that Zdenek and everyone
else involved has gotten so close to doing it. I personally am on the
fence as to whether it's worth paying even the 1% penalty for that
implementation all the time just to get in-place upgrades. If an offline
utility with reasonable (scan instead of dump/reload) downtime and closer
to zero overhead when finished was available instead, that might be a more
reasonable trade-off to make for handling older releases. There are so
many bottlenecks in the older versions that you're less likely to find a
database too large to dump and reload there anyway. It would also be the
case that improvements to that offline utility could continue after 8.4
proper was completely frozen.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD
