Re: State of Beta 2

From: Network Administrator <netadmin(at)vcsn(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: kar(at)kakidata(dot)dk, PgSQL General ML <pgsql-general(at)postgresql(dot)org>
Subject: Re: State of Beta 2
Date: 2003-09-14 14:27:07
Message-ID: 1063549627.3f647abb71f4a@webmail.vcsn.com
Lists: pgsql-general

Not that I know anything about the internal workings of PG, but it seems like a
big part of the issue is the on-disk representation of the database. I've never
had a problem with the whole dump/restore process, and in fact anyone who has
been doing this long enough will remember when that process was gospel for db
upgrades. However, for 24x7 operations, or in general for anyone who simply can
NOT tolerate the downtime of an upgrade, I'm wondering if there is perhaps a way
to abstract the on-disk representation of PG data so that 1) future upgrades do
not have to maintain the same structure if another representation is deemed
better, and 2) upgrades could be done in place.

The abstraction I am talking about would be a logical layer that handles disk
I/O, including the format of that data (let's call this the ADH). By
abstracting that information, the upgrade concern *could* become, "if I upgrade
from, say, 7.2.x to 7.3.x or 7.4.x, do I *want* to take advantage of the new
disk representation?" If yes, then you would go through the necessary process
of upgrading the database, which would always default to the most current
representation. If not, then because the ADH is abstract to the application, it
could run in a 7.2.x or 7.3.x "compatibility mode" so that you would not *need*
to do the dump and restore.
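To make the idea concrete, here is a minimal sketch of what such a layer might look like: a registry of on-disk format decoders keyed by a version tag, so pages written under an older layout remain readable ("compatibility mode") while new writes always use the current format. All names, the one-byte version tag, and the toy payload layouts are hypothetical illustrations, not PostgreSQL internals.

```python
# Hypothetical "ADH" sketch: dispatch page reads by on-disk format version,
# while writes always emit the newest representation.

CURRENT_VERSION = 2

def decode_v1(payload: bytes) -> dict:
    # v1 layout (hypothetical): comma-separated key=value pairs
    return dict(item.split("=") for item in payload.decode().split(","))

def decode_v2(payload: bytes) -> dict:
    # v2 layout (hypothetical): semicolon-separated key:value pairs
    return dict(item.split(":") for item in payload.decode().split(";"))

# Registry of readers: old formats stay readable alongside the current one.
READERS = {1: decode_v1, 2: decode_v2}

def read_page(page: bytes) -> dict:
    version = page[0]  # first byte carries the format version tag
    try:
        decode = READERS[version]
    except KeyError:
        raise ValueError(f"unknown on-disk format version {version}")
    return decode(page[1:])

def write_page(record: dict) -> bytes:
    # New writes always default to the most current representation.
    payload = ";".join(f"{k}:{v}" for k, v in sorted(record.items()))
    return bytes([CURRENT_VERSION]) + payload.encode()
```

Under this sketch, an in-place upgrade would only need to install a new decoder; rewriting every table to the new layout becomes an optional, deferrable step rather than a prerequisite.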

Again, I am completely ignorant of how this really works (and I don't have time
to read through the code), but what I think I'm getting at is a DBI/DBD type
scenario. As a result, there would be another layer of complexity, and I would
think some performance loss as well, but how much complexity and performance
loss is the question. When you juxtapose that against the ability to do
upgrades without the dump/restore, I would think many organizations would say,
"ok, I'll take the x% performance hit and wait until I have the resources to
upgrade the disk representation."

One of the things we are involved with in Philadelphia is providing IT services
to social service programs for outsourced agencies of the local government. In
particular, there have been and are active moves in PA to have these social
service data warehouses go up. Even though it will probably take years to
actually realize this, by that time, once you aggregate all the local agency
databases together, we're going to be talking about very large datasets. That
means that, at least for social service programs, IT is going to have to
approach this whole upgrade question from what I think will be a standpoint of
availability. In short, I don't think it is too far off to consider that the
"little guys" will need to do reliable "in place" upgrades with 100%
confidence.

Hopefully, I was clear on my macro-concept even if I got the micro-concepts wrong.

Quoting Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:

> Kaare Rasmussen <kar(at)kakidata(dot)dk> writes:
> >> "interesting" category. It is in the category of things that will only
> >> happen if people pony up money to pay someone to do uninteresting work.
> >> And for all the ranting, I've not seen any ponying.
>
> > Just for the record now that there's an argument that big companies need
> > 24x7 - could you or someone else with knowledge of what's involved give a
> > guesstimate of how many ponies we're talking. Is it one man month, one
> > man year, more, or what?
>
> Well, the first thing that needs to happen is to redesign and
> reimplement pg_upgrade so that it works with current releases and is
> trustworthy for enterprise installations (the original script version
> depended far too much on being run by someone who knew what they were
> doing, I thought). I guess that might take, say, six months for one
> well-qualified hacker. But it would be an open-ended commitment,
> because pg_upgrade only really solves the problem of installing new
> system catalogs. Any time we do something that affects the contents or
> placement of user table and index files, someone would have to figure
> out and implement a migration strategy.
>
> Some examples of things we have done recently that could not be handled
> without much more work: modifying heap tuple headers to conserve
> storage, changing the on-disk representation of array values, fixing
> hash indexes. Examples of probable future changes that will take work:
> adding tablespaces, adding point-in-time recovery, fixing the interval
> datatype, generalizing locale support so you can have more than one
> locale per installation.
>
> It could be that once pg_upgrade exists in a production-ready form,
> PG developers will voluntarily do that extra work themselves. But
> I doubt it (and if it did happen that way, it would mean a significant
> slowdown in the rate of development). I think someone will have to
> commit to doing the extra work, rather than just telling other people
> what they ought to do. It could be a permanent full-time task ...
> at least until we stop finding reasons we need to change the on-disk
> data representation, which may or may not ever happen.
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly
>

--
Keith C. Perry
Director of Networks & Applications
VCSN, Inc.
http://vcsn.com

____________________________________
This email account is being hosted by:
VCSN, Inc : http://vcsn.com
