Re: Patch for disaster recovery

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Fuhr <mike(at)fuhr(dot)org>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Patch for disaster recovery
Date: 2005-02-20 07:21:00
Message-ID: 12145.1108884060@sss.pgh.pa.us
Lists: pgsql-patches

Michael Fuhr <mike(at)fuhr(dot)org> writes:
> Hmmm...after seeing Tom's reply, I suppose I should have first
> asked, "Gee, looks simple, but does it work?" Should I even bother
> experimenting with it in a test environment?

Experiment away, but I have a hard time visualizing how you'll find it
useful.

Just brainstorming here, but it seems to me that what might make some
kind of sense is a command to "undelete all tuples in this table".
You do that, you look through them, you delete the versions you don't
want, you're happy. The problem with the patch as-is is that (a)
"deleting the versions you don't want" is a no-op, so you cannot keep
straight what you've done in terms of filtering out garbage; and (b)
when you revert to a non-broken postmaster, the stuff you couldn't see
before goes back to being unseeable, because after all you didn't change
its state.
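
To make points (a) and (b) concrete, here is a toy model of tuple
visibility. It is a sketch under loud assumptions, not the code in
tqual.c: RowVersion, visible_normal, visible_see_everything, delete and
undelete_all are all hypothetical names, and the visibility rule is
reduced to "inserter committed and no committed deleter".

```python
from dataclasses import dataclass


@dataclass
class RowVersion:
    value: str
    xmin: int       # id of the inserting transaction
    xmax: int = 0   # id of the deleting transaction, 0 = never deleted


def visible_normal(t, committed):
    """Simplified rule: inserter committed and no committed deleter."""
    return t.xmin in committed and not (t.xmax != 0 and t.xmax in committed)


def visible_see_everything(t, committed):
    """Stand-in for a 'show me everything' hack: every row version passes."""
    return True


def scan(table, rule, committed):
    """Return the values of the row versions that the given rule lets through."""
    return [t.value for t in table if rule(t, committed)]


def delete(t, xid):
    """Mark a row version as deleted by transaction xid."""
    t.xmax = xid


def undelete_all(table, committed):
    """Hypothetical 'undelete' command: clear the delete mark on every
    row version whose inserting transaction committed."""
    for t in table:
        if t.xmin in committed:
            t.xmax = 0
```

In this model, calling delete() on a row version while scanning with
visible_see_everything leaves it in the scan results, and switching back
to visible_normal hides everything that was hidden before. After
undelete_all(), the normal rule shows the recovered versions, and a
later delete() actually removes one from view.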

With either the snapshot kluge or the undelete-all kluge, you have an
issue in that constraints are broken wholesale --- you can see lots of
duplicate row versions that violate unique constraints, deleted versions
that violate FK constraints because they reference also-deleted master
rows, deleted versions that violate later-added CHECK constraints, etc.
I'd sort of like to see the system flip into some mode that says "we're
not promising constraints are honored", and then you can't go back to
normal operation without going through some pushup that checks that all
the remaining rows satisfy the declared constraints.
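
As a rough illustration of what such a recheck would have to catch,
here is a toy pass over the rows left in a table. It is only a sketch:
find_violations and its parameters are hypothetical, rows are plain
dicts, and nothing here reflects how the backend actually validates
constraints.

```python
def find_violations(child_rows, master_keys, unique_col, fk_col, check):
    """Return (row_index, kind) pairs for rows that break a constraint.

    child_rows  -- list of dicts, one per remaining row version
    master_keys -- set of key values still present in the master table
    unique_col  -- column that is declared unique
    fk_col      -- column that references the master table
    check       -- function(row) -> bool standing in for a CHECK constraint
    """
    violations = []
    seen = set()
    for i, row in enumerate(child_rows):
        key = row[unique_col]
        if key in seen:
            # A second row version with the same key, e.g. an undeleted old version.
            violations.append((i, "unique"))
        seen.add(key)
        if row[fk_col] not in master_keys:
            # The referenced master row is itself gone.
            violations.append((i, "foreign key"))
        if not check(row):
            # The row predates, and fails, a later-added CHECK constraint.
            violations.append((i, "check"))
    return violations
```

In this model the table could go back to normal operation only once
find_violations() returns an empty list.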

In any case I suspect it's a bad idea to treat tuples as good if their
originating transaction did not commit. For starters, such a tuple
might not possess all the index entries it should (if the originating
transaction failed before inserting said entries). I think what we
want to think about is overriding delete commands, not overriding
insert failures.

Not sure where this leads to, but it's not leading to an undocumented
one-line hack in tqual.c, and definitely not *that* one-line hack.

regards, tom lane
