RE: Proposal: More flexible backup/restore via pg_dump

From: Philip Warner <pjw(at)rhyme(dot)com(dot)au>
To: Peter Mount <petermount(at)it(dot)maidstone(dot)gov(dot)uk>, Giles Lean <giles(at)nemeton(dot)com(dot)au>
Cc: Zeugswetter Andreas SB <ZeugswetterA(at)wien(dot)spardat(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: RE: Proposal: More flexible backup/restore via pg_dump
Date: 2000-06-27 10:30:58
Message-ID: 3.0.5.32.20000627203058.023cb870@mail.rhyme.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At 10:55 27/06/00 +0100, Peter Mount wrote:
>comments prefixed with PM...
>
>PM: So can most other Unix based formats. On the intranet server here, I
>pg_dump into /tmp, then include them in a tar piped to the tape drive.

Since you are using an intermediate file, there would be no change. fseek
is only an issue on tape drives and pipes, not files on disk. I suspect
that most people can afford the overhead of restoring the backup file to
disk before restoring the database, but it'd be nice to start out as
flexible as possible. In the back of my mind is the fact that when the WAL
and storage manager are going, raw data backups should be possible (and
fast) - but then again, maybe it's a pipe dream.

>PM: The problem with blobs hasn't been with dumping them (I have some Java
>code that does it into a compressed zip file), but restoring them - you
>can't create a blob with a specific OID, so any references in existing
>tables will break. I currently get round it by updating the tables after the
>restore - but it's ugly and easy to break :(

I assumed this would have to happen - hence why it will not be in the first
version. With all the TOAST stuff coming, and talk about the storage
manager, I still live in hope of a better BLOB system...

>PM: Having a set of api's (either accessible directly into the backend,
>and/or via some fastpath call) would be useful indeed.

By this I assume you mean APIs to get to database data, not backup data.

>
>This is probably an issue. One of the motivations for this utility it to
>allow partial restores (eg. table data for one table only), and
>arbitrarilly ordered restores. But I may have a solution:
>
>PM: That would be useful. I don't know about CPIO, but tar stores the TOC at
>the start of each file (so you can actually join two tar files together and
>still read all the files). In this way, you could put the table name as the
>"filename" in the header, so partial restores could be done.

Well, the way my plans work, I'll use either a munged OID, or a arbitrary
unique ID as the file name. All meaningful access has to go via the TOC.
But that's just a detail; the basic idea is what I'm implementing.

It's very tempting to say tape restores are only possible in the order in
which the backup file was written ('pg_restore --stream' ?), and that
reordering/selection is only possible if you put the file on disk.

>
>PM: How about IOCTL's? I know that ArcServe on both NT & Unixware can seek
>through the tape, so there must be a way of doing it.

Maybe; I know BackupExec also does some kind of seek to update the TOC at
end of a backup (which is what I need to do). Then again, maybe that's just
a rewind. I don't want to get into custom tape formats...

Do we have any tape experts out there?

>PM: Tar can do this sequentially, which I've had to do many times over the
>years - restoring just one file from a tape, sequential access is probably
>the only way.

It's just nasty when the user reorders the restoration of tables and
metadata. In the worst cast it might be hundreds of scans of the tape. I'd
hate to have my name associated with something so unfriendly (even if it is
the operators fault).

>PM: The tar spec should be around somewhere - just be careful, the obvious
>source I was thinking of would be GPL'd, and we don't want to be poluted :-)

That was my problem. I've got some references now, and I'll look at them.
At least everything I've written so far can be put in PG.

----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.C.N. 008 659 498) | /(@) ______---_
Tel: (+61) 0500 83 82 81 | _________ \
Fax: (+61) 0500 83 82 82 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp5.ai.mit.edu:11371 |/

Browse pgsql-hackers by date

  From Date Subject
Next Message Grigori Soloviov 2000-06-27 10:41:25 Any Documentation on PL/pgSQL?
Previous Message Hannu Krosing 2000-06-27 10:06:40 Re: SQL99 functions