Re: Proposal: More flexible backup/restore via pg_dump

From: Philip Warner <pjw(at)rhyme(dot)com(dot)au>
To: Giles Lean <giles(at)nemeton(dot)com(dot)au>
Cc: Zeugswetter Andreas SB <ZeugswetterA(at)wien(dot)spardat(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proposal: More flexible backup/restore via pg_dump
Date: 2000-06-27 09:07:03
Message-ID: 3.0.5.32.20000627190703.02029100@mail.rhyme.com.au
Lists: pgsql-hackers

At 07:00 27/06/00 +1000, Giles Lean wrote:
>
>Are you are also assuming that a backup fits in a single file,
>i.e. that anyone with >2GB of backup has some sort of large file
>support?

That's up to the format used to save the database; in the case of the
'custom' format, yes. But that is the size after compression. This is not
substantially different to pg_dump's behaviour, except that pg_dump can be
piped to a tape drive...

The objectives of the API components are to (a) make it very easy to add new
metadata to the dump (eg. tablespaces), and (b) make it easy to add new
output formats (eg. tar archives). Basically, the metadata-dumping side makes
one call to register the thing to be saved, passing an optional function
pointer to dump the data (eg. table contents) - this *could* even be used to
implement dumping of BLOBs.
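
To make that concrete, the registration call might look something like this
(a sketch only; the names and layout here are illustrative, not the final
API):

#include <stdlib.h>
#include <string.h>

typedef struct Archive Archive;

/* Optional callback that dumps the item's data (eg. table contents). */
typedef int (*DataDumperPtr)(Archive *AH, void *dumperArg);

typedef struct TocEntry
{
    char            *name;      /* object name, eg. "mytable"        */
    char            *defn;      /* SQL needed to recreate the object */
    DataDumperPtr    dumper;    /* NULL for metadata-only entries    */
    void            *dumperArg;
    struct TocEntry *next;
} TocEntry;

struct Archive
{
    TocEntry *toc;              /* list of everything to be saved    */
};

/*
 * One call registers a thing to be saved; the optional dumper could
 * equally well be a function that writes out a BLOB.
 */
void
ArchiveEntry(Archive *AH, const char *name, const char *defn,
             DataDumperPtr dumper, void *dumperArg)
{
    TocEntry *te = (TocEntry *) malloc(sizeof(TocEntry));

    te->name = strdup(name);
    te->defn = strdup(defn);
    te->dumper = dumper;
    te->dumperArg = dumperArg;
    te->next = AH->toc;
    AH->toc = te;
}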

The 'archiver' format provider must supply some basic IO routines
(Read/WriteBuf and Read/WriteByte), and has a number of hook functions which
it can use to output the data. It needs to provide at least one function
that actually writes data somewhere, along with the associated function to
read the data back.
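
For concreteness, the provider's contract could be expressed as a table of
function pointers, roughly along these lines (again a sketch; the member
names are mine):

#include <stddef.h>

typedef struct ArchiveHandle ArchiveHandle;
typedef struct TocEntry TocEntry;

typedef struct ArchiveFormat
{
    /* Basic IO routines every format provider supplies. */
    size_t (*WriteBuf)(ArchiveHandle *AH, const void *buf, size_t len);
    size_t (*ReadBuf)(ArchiveHandle *AH, void *buf, size_t len);
    int    (*WriteByte)(ArchiveHandle *AH, int i);
    int    (*ReadByte)(ArchiveHandle *AH);

    /* Hooks called around each item so the provider can put the
       bytes somewhere: a custom file, a directory, later a tar... */
    void   (*StartData)(ArchiveHandle *AH, TocEntry *te);
    void   (*EndData)(ArchiveHandle *AH, TocEntry *te);
} ArchiveFormat;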

>
>As someone else answered: no. You can't portably assume random access
>to tape blocks.

This is probably an issue. One of the motivations for this utility is to
allow partial restores (eg. table data for one table only) and arbitrarily
ordered restores. But I may have a solution:

Write the schema and TOC at the start of the file/tape, then the compressed
data, each block with a header indicating which TOC item it corresponds to.
The metadata can be loaded into /tmp, so fseek is possible there. The actual
data restoration (assuming constraints are not defined [THIS IS A PROBLEM])
can be done by scanning the rest of the tape in its own order, since RI
will not be an issue. I think I'm happy with this.
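
In outline (field names invented for illustration):

/*
 * Layout sketch. The schema and TOC up front are small enough to copy
 * to a temp file in /tmp, where fseek works; each data block then
 * carries a header naming its TOC item, so one sequential pass can
 * restore everything in tape order.
 */
typedef struct DataBlockHeader
{
    int  tocId;             /* which TOC entry this block belongs to */
    long compressedLen;     /* bytes of compressed data following    */
} DataBlockHeader;

/*
 * Tape/file layout:
 *   [ header / version ]
 *   [ schema + TOC ]                    <- copy to /tmp for fseek
 *   [ DataBlockHeader ][ data bytes ]   <- repeated, read sequentially
 *   [ DataBlockHeader ][ data bytes ]      in whatever order written
 */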

But the problem is the constraints: AFAIK there is no 'ALTER TABLE ADD
CONSTRAINT...', so PK, FK and NOT NULL constraints have to be applied before
the data load (*please* tell me I'm wrong). This also means that for large
databases I should create indexes to make the PK/FK checks fast, but they
will slow the data load.

Any ideas?

>> The output scheme will be encapsulated, and in the initial version will be
>> a custom format (since I can't see an API for tar files)
>
>You can use a standard format without there being a standard API.

Being a relatively lazy person, I was hoping to leave that as an exercise
for the reader...

>Using either tar or cpio format as defined for POSIX would allow a lot
>of us to understand your on-tape format with a very low burden on you
>for documentation. (If you do go this route you might want to think
>about cpio format; it is less restrictive about filename length than
>tar.)

Tom Lane was also very favorably disposed to tar format. As I said above,
the archive interfaces should be pretty amenable to adding tar support -
it's just that I'd like to get a version working with the custom and
directory-based formats first, to ensure the flexibility is there. As I see
it, the 'backup to directory' format should be easy to use as a basis for
the 'backup to tar' code.

The problem I have with tar is that it does not support random access to
the associated data. For reordering large backups, or (ultimately) single
BLOB extraction, this is a performance problem.

If you have a tar spec (or suitably licensed code), please mail it to me,
and I'll be able to make more informed comments.

>Presumably you'd expect this file I/O to be through some standard API
>that other backends would also use? I'd be interested to see this;
>I've got code for an experimental libtar somewhere around here, so I
>could offer comments at least.

No problem: I should have a working version pretty soon. The API is
strictly purpose-built; it would be adaptable to a more general archive
format, but as you say, tar is fine for most purposes.

>> BUT AFAIK, fseek does not work on STDOUT, and at the current time pg_backup
>> will use fseek.
>
>Not using fseek() would be a win if you can see a way to do it.

I think I probably can, if I can work my way around the RI problems.
Unfortunately the most effective solution would be to allow reordering of
the table data restoration, but that requires multiple passes through the
file to find each table's data...
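
By way of illustration (again with an invented block header, not the real
format), a pass over a non-seekable stream can only skip forward, so each
out-of-order item may cost a rewind and a fresh scan:

#include <stdio.h>

/*
 * Sketch only: scan forward for one wanted TOC item, consuming the
 * data of every block in between, since fseek is unavailable on
 * stdin or a tape drive.
 */
static int
restoreOneItem(FILE *in, int wantedTocId)
{
    int  tocId;
    long len, i;

    while (fread(&tocId, sizeof(tocId), 1, in) == 1 &&
           fread(&len, sizeof(len), 1, in) == 1)
    {
        if (tocId == wantedTocId)
            return 1;               /* found: restore the data here */

        for (i = 0; i < len; i++)   /* can't seek past it; read and */
            (void) getc(in);        /* discard the block's bytes    */
    }
    return 0;                       /* not found: rewind, scan again */
}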

Bye for now,

Philip

----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.C.N. 008 659 498) | /(@) ______---_
Tel: (+61) 0500 83 82 81 | _________ \
Fax: (+61) 0500 83 82 82 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp5.ai.mit.edu:11371 |/
