Re: [WIP] In-place upgrade

From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP] In-place upgrade
Date: 2008-11-03 20:36:07
Message-ID: 490F60B7.4030409@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Big thanks for review.

Robert Haas napsal(a):
> I tried to apply this patch to CVS HEAD and it blew up all over the
> place. It doesn't seem to be intended to apply against CVS HEAD; for
> example, I don't have backend/access/heap/htup.c at all, so can't
> apply changes to that file.

You need to apply also two other patches:
which are located here:
http://wiki.postgresql.org/wiki/CommitFestInProgress#Upgrade-in-place_and_related_issues
I moved one related patch from another category here to correct place.

The problem is that it is difficult to keep it in sync with head, because they
change a lot of things. It the reason why I put all also into GIT repository,
but ...

> I was able to clone the GIT repository
> with the following command...
>
> git clone http://git.postgresql.org/git/~davidfetter/upgrade_in_place/.git
>
> ...but now I'm confused, because I don't see the changes from the diff
> reflected in the resulting tree. As you can see, I am not a git
> wizard. Any help would be appreciated.

I'm GIT newbie I use mercurial for development and I manually applied changes
into GIT. I asked David Fetter with help how to get back the correct clone. In
meantime you can download a tarball.

http://git.postgresql.org/?p=~davidfetter/upgrade_in_place/.git;a=snapshot;h=c72bafada59ed278ffac59657c913bc375f77808;sf=tgz

It should contains every think including yesterdays improvements (delete,
insert, update works - inser/update only on table without index).

> Here are a few initial thoughts based mostly on reading the diff:
>
> In the minor nit department, I don't really like the idea of
> PageHeaderData_04, SizeOfPageHeaderData04, PageLayoutIsValid_04, etc.
> I think the latest version should just be PageHeaderData and
> SizeOfPageHeaderData, and previous versions should be, e.g.
> PageHeaderDataV3. It looks to me like this would cut a few hunks out
> of this and maybe make it a bit easier to understand what is going on.
> At any rate, if we are going to stick with an explicit version number
> in both versions, it should be marked in a consistent way, not _04
> sometimes and just 04 other times. My suggestion is e.g. "V4" but
> YMMV.

Yeah, it is most difficult part :-) find correct names for it. I think that each
version of structure should have version suffix including lastone. And of
cource the last one we should have a general name without suffix - see example:

typedef struct PageHeaderData_04 { ...} PageHeaderData_04
typedef struct PageHeaderData_03 { ...} PageHeaderData_03
typedef PageHeaderData_04 PageHeaderData

This allows you exactly specify version on places where you need it and keep
general name where version is not relevant.

How suffix should looks it another question. I prefer to have 04 not only 4.
What's about PageHeaderData_V04?

By the way what YMMV means?

> The changes to nodeIndexscan.c and nodeSeqscan.c are worrisome to me.
> It looks like the added code is (nearly?) identical in both places, so
> probably it needs to be refactored to avoid code duplication. I'm
> also a bit skeptical about the idea of doing the tuple conversion
> here. Why here rather than ExecStoreTuple()? If you decide to
> convert the tuple, you can palloc the new one, pfree the old one if
> ShouldFree is set, and reset shouldFree to true.

Good point. I thought about it as a one variant. And if I look it close now it
is really much better place. It should fix a problem why REINDEX does not work.
I will move it.

> I am pretty skeptical of the idea that all of the HeapTuple* functions
> can just be conditionalized on the page version and everything will
> Just Work. It seems like that is too low a level to be worrying about
> such things. Even if it happens to work for the changes between V3
> and V4, what happens when V5 or V6 is changed in such a way that the
> answer to HeapTupleIsWhatever is neither "Yes" nor "No", but rather
> "Maybe" or "Seven"? The performance hit also sounds painful. I don't
> have a better idea right now though...

OK. Currently it works (or I hope that it works). If somebody in a future invent
some special change, i think in most (maybe all) cases there will be possible
mapping.

The speed is key point. When I check it last time I go 1% performance drop in
fresh database. I think 1% is good price for in-place online upgrade.

> I think it's going to be absolutely imperative to begin vacuuming away
> old V3 pages as quickly as possible after the upgrade. If you go with
> the approach of converting the tuple in, or just before,
> ExecStoreTuple, then you're going to introduce a lot of overhead when
> working with V3 pages. I think that's fine. You should plan to do
> your in-place upgrade at 1AM on Christmas morning (or whenever your
> load hits rock bottom...) and immediately start converting the
> database, starting with your most important and smallest tables. In
> fact, I would look whenever possible for ways to make the V4 case a
> fast-path and just accept that the system is going to labor a bit when
> dealing with V3 stuff. Any overhead you introduce when dealing with
> V3 pages can go away; any V4 overhead is permanent and therefore much
> more difficult to accept.

Yes, it is a plan to improve vacuum to convert old page to new one. But in as a
second step. I have already page converter code. With some modification it could
be integrated easily into vacuum code.

> That's about all I have for now... if you can give me some pointers on
> working with this git repository, or provide a complete patch that
> applies cleanly to CVS HEAD, I will try to look at this in more
> detail.

Thanks for your comments. Try snapshot link. I hope that it will work.

Zdenek

PS: I'm sorry about response time, but I'm on training this week.

--
Zdenek Kotala Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2008-11-03 20:38:28 Second thoughts about pg_typeof
Previous Message Sebastian Böhm 2008-11-03 19:42:54 Re: reliable lock inside stored procedure (SOLVED)