From: Christopher Browne <cbbrowne(at)acm(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: need for in-place upgrades (was Re: State of
Date: 2003-09-15 03:05:51
Message-ID: m31xuiwy28.fsf@wolfe.cbbrowne.com
Lists: pgsql-general

In the last exciting episode, ron(dot)l(dot)johnson(at)cox(dot)net (Ron Johnson) wrote:
> On Sun, 2003-09-14 at 14:17, Christopher Browne wrote:
>> <http://spectralogic.com> discusses how to use their hardware and
>> software products to do terabytes of backups in an hour. They sell a
>> software product called "Alexandria" that knows how to (at least
>> somewhat) intelligently back up SAP R/3, Oracle, Informix, and Sybase
>> systems. (When I was at American Airlines, that was the software in
>> use.)
>
> HP, Hitachi, and a number of other vendors make similar hardware.
>
> You mean the database vendors don't build that parallelism into
> their backup procedures?

They don't necessarily build every conceivable bit of possible
functionality into the backup procedures they provide, if that's what
you mean.

Of the systems mentioned, I'm most familiar with SAP's backup
regimen; if you're using it with Oracle, you'll use tools called
"brbackup" and "brarchive", which provide a _moderately_ sophisticated
scheme for dealing with backing things up.

But if you need to do something wild, such as having two nearby
servers, each with 8 tape drives, manage backups for a whole cluster
of systems (a combination of OS backups, DB backups, and application
backups), it's _not_ reasonable to expect one DB vendor's backup
tools to be totally adequate to that.

Alexandria (and similar software) certainly needs tool support from DB
makers to allow them to intelligently handle streaming the data out of
the databases.

At present, this unfortunately _isn't_ something PostgreSQL does, from
two perspectives:

1.  You can't simply keep the WALs and reapply them in order to bring
    a second database up to date;

2.  A pg_dump doesn't provide a way of streaming parts of the
    database in parallel, at least not if all the data is in
    one database. (There's some nifty stuff in eRServ that
    might eventually be relevant, but probably not yet...)

There are partial answers:

 - If there are multiple databases, starting multiple pg_dump
   sessions provides some useful parallelism;

 - A suitable logical volume manager may allow splitting off
   a copy atomically, and then you can grab the resulting data
   in "strips" to pull it in parallel.
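The first partial answer is just shell job control: one pg_dump
session per database, all running at once. A minimal sketch of the
pattern (the database names are illustrative, and backup_one is a
stand-in so the skeleton can run anywhere; in a real setup it would
invoke something like `pg_dump -Fc "$1" > "/backup/$1.dump"`):

```shell
#!/bin/sh
# Parallel-dump pattern: each database gets its own backup session,
# run as a background job, so the dumps stream concurrently.
backup_one() {
    # stand-in for: pg_dump -Fc "$1" > "/backup/$1.dump"
    echo "dump of $1" > "/tmp/$1.dump"
}

for db in sales hr inventory; do
    backup_one "$db" &     # launch one session per database
done
wait                       # proceed only after every session finishes
```

Each backgrounded session streams independently, so with enough tape
or disk bandwidth the total wall-clock time approaches that of the
largest single database rather than the sum of all of them.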

Life isn't always perfect.

>> Generally, this involves having a bunch of tape drives that are
>> simultaneously streaming different parts of the backup.
>>
>> When it's Oracle that's in use, a common strategy involves
>> periodically doing a "hot" backup (so you can quickly get back to a
>> known database state), and then having a robot tape drive assigned
>> to regularly push archive logs to tape as they are produced.
>
> Rdb does the same thing. You mean DB/2 can't/doesn't do that?

I haven't the foggiest idea, although I would be somewhat surprised if
it doesn't have something of the sort.
--
(reverse (concatenate 'string "moc.enworbbc" "@" "enworbbc"))
http://www.ntlug.org/~cbbrowne/wp.html
Rules of the Evil Overlord #139. "If I'm sitting in my camp, hear a
twig snap, start to investigate, then encounter a small woodland
creature, I will send out some scouts anyway just to be on the safe
side. (If they disappear into the foliage, I will not send out another
patrol; I will break out napalm and Agent Orange.)"
<http://www.eviloverlord.com/>
