Re: [bug fix] PITR corrupts the database cluster

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, MauMau <maumau307(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [bug fix] PITR corrupts the database cluster
Date: 2013-07-24 14:57:14
Message-ID: 15076.1374677834@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> The only thing here that really bothers me is that a crash during DROP
> DATABASE/TABLESPACE could leave us with a partially populated db/ts
> that's still accessible through the system catalogs. ...
> I guess one thing we could do is create a flag file, say
> "dead.dont.use", in the database's default-tablespace directory, and
> make new backends check for that before being willing to start up in
> that database; then make sure that removal of that file is the last
> step in DROP DATABASE.

After a bit of experimentation, it seems there's a pre-existing way that
we could do this: just remove PG_VERSION from the database's
default-tablespace directory as the first filesystem action in DROP
DATABASE. If we
crash before committing, subsequent attempts to connect to that database
would fail like this:

$ psql bogus
psql: FATAL: "base/176774" is not a valid data directory
DETAIL: File "base/176774/PG_VERSION" is missing.

which is probably already good enough, though maybe we could add a HINT
suggesting that the DB was incompletely dropped.
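
To make the idea concrete, here is a toy model in Python of the "remove PG_VERSION first" ordering. It is not PostgreSQL code: the names check_database_dir, drop_database_dir and the crash_after knob are hypothetical, and the directory is assumed to be flat (files only). It only illustrates that a drop interrupted at any point after the first unlink leaves a directory that the startup check rejects.

```python
import os

VERSION_FILE = "PG_VERSION"


class InvalidDataDirectory(Exception):
    """Raised when a database directory lacks its PG_VERSION marker."""


def check_database_dir(db_dir):
    """Refuse to use a directory whose PG_VERSION file is missing."""
    if not os.path.isfile(os.path.join(db_dir, VERSION_FILE)):
        raise InvalidDataDirectory(
            '"%s" is not a valid data directory' % db_dir)


def drop_database_dir(db_dir, crash_after=None):
    """Remove db_dir, unlinking PG_VERSION before anything else.

    crash_after is a hypothetical test knob: stop after that many
    unlinks to imitate a crash partway through the drop.
    Returns True if the directory was fully removed, False otherwise.
    """
    names = sorted(os.listdir(db_dir))
    # Stable sort that moves the marker file to the front, so its
    # removal is the first filesystem action.
    names.sort(key=lambda name: name != VERSION_FILE)
    for removed, name in enumerate(names):
        if crash_after is not None and removed >= crash_after:
            return False  # simulated crash: directory left half-populated
        os.unlink(os.path.join(db_dir, name))
    os.rmdir(db_dir)
    return True
```

Calling drop_database_dir(path, crash_after=1) and then check_database_dir(path) raises InvalidDataDirectory, which mirrors the FATAL message above; rerunning the drop without the knob finishes the job.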

To ensure this is replayed properly on slave servers, I'd be inclined to
mechanize it by (1) changing remove_dbtablespaces to ensure that the
DB's default tablespace is the first one dropped, and (2) incorporating
remove-PG_VERSION-first into rmtree().
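
The ordering in (1) and (2) can be sketched as a pure planning function, again as a Python toy rather than actual backend code. The function name, the mapping of tablespace names to directories, and the action tuples are all assumptions made for illustration; the "unlink_if_exists" action reflects an assumption that PG_VERSION may not be present in every directory being removed.

```python
import posixpath


def plan_database_removal(tablespace_dirs, default_tablespace):
    """Return the ordered filesystem actions for dropping one database.

    tablespace_dirs maps a tablespace name to this database's directory
    in that tablespace. Both the mapping and the action tuples are
    hypothetical, chosen only to make the ordering visible.
    """
    if default_tablespace not in tablespace_dirs:
        raise ValueError(
            "default tablespace has no directory for this database")
    # (1) Default tablespace first, the others in a stable order.
    ordered = [default_tablespace] + sorted(
        name for name in tablespace_dirs if name != default_tablespace)
    actions = []
    for name in ordered:
        path = tablespace_dirs[name]
        # (2) Within each tree, PG_VERSION goes before anything else.
        actions.append(
            ("unlink_if_exists", posixpath.join(path, "PG_VERSION")))
        actions.append(("rmtree", path))
    return actions
```

Because the plan depends only on its inputs, the primary and a server replaying the same drop would derive the same sequence, with the default tablespace's PG_VERSION always going first.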

regards, tom lane
