Re: What pg_restore does to a non-empty target database

From: "Ken Winter" <ken(at)sunward(dot)org>
To: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "'PostgreSQL pg-general List'" <pgsql-general(at)postgresql(dot)org>
Subject: Re: What pg_restore does to a non-empty target database
Date: 2008-01-15 01:11:24
Message-ID: 004a01c85713$8d037bb0$6703a8c0@KenIBM
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Based on Tom Lane's response, here is version 2 of my attempt to document
what pg_restore does to a target database that already contains objects.

Version 2 has been limited to the case where pg_dump was run with the
--column-inserts option and pg_restore is run with the --clean option.
Also, when there is a possible difference between what pg_restore "tries" to
do (i.e. the SQL actions it invokes) and what actually happens, Version 2
indicates the latter on a line labeled "Exception:".

Version 2 depends on the answer to the following question: Is it true that
the sequence of SQL commands that is extracted into a file by:

pg_restore --file=foo.sql [other arguments] foo.bak

is exactly the sequence of SQL commands that will be executed in database
foo by the command:

pg_restore --dbname=foo [other arguments] foo.bak

(assuming the same [other arguments] in both cases)?

If so, having looked at the extracted SQL from several examples, I'm
thinking it's a safe bet that no pg_restore DROP, DELETE, or CREATE action
will ever fail (except when a DROP's target object does not exist), because
pg_restore sequences the "cleaning" activities so they get rid of dependent
objects first. Right?

If so, that would seem to guarantee that every action that pg_restore tries
to do in the narrative below is guaranteed to succeed, except INSERTs, which
can fail only as described below.

If not, I have another version of the story below that is a lot more
qualified and complicated.

As before, I solicit your confirmations, corrections, and additions of this
document, hoping to get it to the point where my project team (and anyone
else who wants it) can use it with confidence.

~ Thanks
~ Ken

---------------------------

WHAT PG_RESTORE DOES
Version 2

Given a pg_restore command, where:
* A is the source archive file (as filtered and reordered by a ToC file,
if any) produced by pg_dump with "--format=t" and "--column-inserts"
options.
* T is the target database.
* O is a database object (table, function, etc) that exists in A
and/or in C.
* The pg_restore command has the --clean option, and it does not have the
--table= or --trigger= or "--exit-on-error or --disable-triggers options.

The following are the changes that the pg_restore command will produce in
database T.

EFFECTS ON SCHEMA OBJECTS

If object O exists in both A and T, pg_restore:
* Drops T's version of O.
* Creates A's version of O.
If object O exists in T but not in A, pg_restore:
* Leaves T's version of O unchanged.
If object O exists in A but not in T, pg_restore:
* Creates A's version of O.

EFFECTS ON TABLES AND THEIR DATA

Suppose in addition that:
* Database object O is a base table.
* O contains data in both archive A and target database T.

If the command says "--data-only", pg_restore:
* Leaves T's schema definition of O unchanged.
* Tries to delete all of T's O data.
If this causes FK violations, the result depends row-by-row
on
the ON DELETE action of the FK constraint.
* Tries to insert all of A's O data.
The INSERT of any row that causes constraint violations
or other fatal errors (see below) fails.
If the command says "--schema-only", pg_restore:
* Drops T's version of O, which deletes T's O data as a side-effect.
* Create A's version of O.
* Does not try to insert any of A's O data, so O ends up empty.
If the command says "--data-only" and "--schema-only", pg_restore:
* Leaves O and its data unchanged.
If the command says neither "--data-only" nor "--schema-only", pg_restore:
* Drops T's version of O.
* Assumes that T's O data were deleted (as a side-product of the
DROP)
* Creates A's version of O.
* Inserts all of A's O data.

EFFECTS OF DIFFERENCES BETWEEN A AND T TABLE SCHEMAS

Suppose in addition that:
* The pg_restore command says "--data-only".
* T's schema definition of table O is different from A's.

If column O.C exists in T's schema but not in A's:
* O.C is set to Null in all rows that pg_restore inserts.
If column O.C exists in A's schema but not in T's:
* The O.C values are lost from all rows that pg_restore inserts.
If column O.C exists in both schemas with incompatible types:
* All inserts of rows from A fail.
If constraint K exists in T's schema but not in A's:
* Inserts of rows from A that would violate K fail.
If constraint K exists in A's schema but not in T's:
* K has no effect on the insertion of rows from A.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Gregory Stark 2008-01-15 02:51:42 Re: Index trouble with 8.3b4
Previous Message Joshua D. Drake 2008-01-15 00:11:55 Registration for PostgreSQL Conference East now open