Re: pg_upgrade project status

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_upgrade project status
Date: 2009-01-29 03:12:10
Message-ID: 603c8f070901281912l79c92d8ctb5064d29f66cac9a@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 28, 2009 at 6:05 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Well, what it really means is that all the special-purpose conversion
> code is in SQL instead of C. Which is sort of good as long as whatever
> transformation you have in mind can be done easily in SQL. (Good luck
> if you need to control the OIDs of the inserted rows, for instance.
> And I *really* want to see Zdenek handle conversion of stored-rule query
> trees in SQL...) But far more importantly, it doesn't fix the problem
> that you have to write conversion code in the first place.
>
> The appeal of the pg_dump approach is that it will automatically handle
> everything that there exists a plain-SQL representation for, which is to
> say darn near everything. We will need special purpose code to deal
> with the dropped-column and TOAST-oid issues, but that can probably be
> written in SQL if it makes anyone feel better to do so ;-). The more
> important point is that once we're done with those two issues, we're
> done, and will stay done for the foreseeable future (at least with
> respect to catalog upgrades).
>
> I am not sure why everyone is so hot to create a conversion path that
> guarantees extra maintenance pain for every future catalog
> reorganization, when it would be no more complex to create one without
> such a burden.

Well, I don't personally believe that there will be no extra
maintenance pain associated with the pg_dump approach. In fact, the
extra maintenance pain will be exactly proportional to the difference
between the "darn near everything" that it handles and "everything".
Basically, every time you invent a feature that can do things to a
system catalog that aren't visible at the SQL level, you're going to
experience the searing pain of having to invent SQL-ish syntax that
can be dumped-and-restored without losing that mysterious system
catalog magic.

The first problem with that is that it is really ugly. Full stop.

The second problem with that is that you are relying on your ability
to translate tuples in a database table into text format (and not,
mind you, the same text format that we normally use to back up and
restore databases, but some modified text format with special hacks
that are only used when we need to represent things like dropped
columns) and then to translate that text format back into a set of new
tables that are semantically identical to the original ones in every
particular. To my way of thinking, this is a Rube Goldberg machine.
Transforming one set of tuples into a slightly different set of tuples
using SQL seems way less prone to errors and omissions.
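
To make the dropped-column case concrete, here is a minimal toy sketch in Python. It is not code from pg_dump or pg_upgrade: the row layout only loosely imitates pg_attribute, and every function name and the "attnewfield" column are hypothetical. It compares a round trip through plain SQL text with a direct row-by-row transformation.

```python
# Toy model only: these dicts loosely imitate pg_attribute rows and are
# not the real catalog format.
OLD_ATTRS = [
    {"attname": "a", "attnum": 1, "atttype": "int4", "attisdropped": False},
    {"attname": "b", "attnum": 2, "atttype": "int4", "attisdropped": True},
    {"attname": "c", "attnum": 3, "atttype": "text", "attisdropped": False},
]


def dump_to_text(attrs):
    """Render only what plain SQL can express: live columns, in order."""
    cols = [f"{r['attname']} {r['atttype']}" for r in attrs if not r["attisdropped"]]
    return "CREATE TABLE t (" + ", ".join(cols) + ")"


def restore_from_text(ddl):
    """Rebuild attribute rows from the DDL text, numbering columns from 1."""
    body = ddl[ddl.index("(") + 1 : ddl.rindex(")")]
    rows = []
    for attnum, col in enumerate(body.split(", "), start=1):
        name, typ = col.split(" ")
        rows.append(
            {"attname": name, "attnum": attnum, "atttype": typ, "attisdropped": False}
        )
    return rows


def transform_directly(attrs):
    """Map each old row to a new-format row, adding one hypothetical field."""
    return [{**r, "attnewfield": 0} for r in attrs]


via_text = restore_from_text(dump_to_text(OLD_ATTRS))
direct = transform_directly(OLD_ATTRS)
```

In this toy model the text round trip loses the dropped column and renumbers "c" from 3 to 2, so the rebuilt rows no longer describe the old on-disk tuples unless the text format grows special syntax for that case. The direct transformation carries every row across unchanged apart from the added field.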

I also kind of think that it might open the door to using the system
catalogs to indicate things like "the earliest page version that
appears in relation X". There's no joy in inventing some kind of
pg_dump syntax for that sort of thing just so that you can set it
properly when someone does an in-place upgrade. It's useless for
normal operation, since any relation that was not upgraded in place will
always have the current version in that field, and it feels wrong for
users to have the ability to fiddle with that value via SQL anyway.

With respect to the specific problems you mention, OIDs are definitely
an issue, but do you think that's an insurmountable obstacle? Seems
like we should be able to find a hammer large enough to solve that
problem. As for rules, just because the core of the engine is written
in SQL doesn't mean that it can't make outcalls to C functions; we
already have an interface for that. It is better than writing the
whole thing in C, to be sure...

I don't know; I'm not completely sure how hard this will be, or which
approach is better. But it sounds to me like this has potential, if
done right.

...Robert
