Re: [GENERAL] Restart after power outage: createdb

From: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: lapham(at)jandr(dot)org, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [GENERAL] Restart after power outage: createdb
Date: 2006-09-27 20:44:36
Message-ID: 20060927204436.GU19827@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Wed, Sep 27, 2006 at 04:13:34PM -0400, Tom Lane wrote:
> Jon Lapham <lapham(at)jandr(dot)org> writes in pgsql-general:
> > If I run...
> > sleep 3; echo starting; createdb bar
> > ...and power off the VM while the "createdb bar" is running.
>
> > Upon restart, about 50% of the time I can reproduce the following error
> > message:
>
> > [lapham(at)localhost ~]$ psql bar
> > psql: FATAL: database "bar" does not exist
> > [lapham(at)localhost ~]$ createdb bar
> > createdb: database creation failed: ERROR: could not create directory
> > "base/65536": File exists
>
> What apparently is happening here is that the same OID has been assigned
> to the new database both times. Even though the createdb didn't
> complete, the directory it started to build is there and so there's a
> filename collision.
>
> > So, running "createdb bar" a second time works.
>
> Yeah, because the OID counter has been advanced, and so the second
> createdb uses a nonconflicting OID.
>
> In theory this scenario should not happen, because a crash-and-restart
> is supposed to guarantee that the OID counter comes up at or beyond
> where it was before the crash.
>
> After thinking about it for awhile, I believe the problem is that
> CREATE DATABASE is breaking the "WAL rule": it's allowing a data change
> (specifically, creation of the new DB subdirectory) to hit disk without
> having guaranteed that associated WAL entries were flushed first.
> Specifically, if we generated an XLOG_NEXTOID WAL entry to record the
> consumption of an OID for the database, there isn't anything ensuring
> that record gets to disk before the mkdir occurs. (ie, the comment in
> XLogPutNextOid is correct as far as it goes, but it fails to account
> for outside-the-database effects such as creation of a directory named
> after the OID.) Hence after restart the OID counter might not get
> advanced as far as it should have been.
>
> We could fix this two different ways:
>
> 1. Put an XLogFlush into createdb() somewhere between making the
> pg_database entry and starting to create subdirectories.
>
> 2. Check for conflicting database directories while assigning the OID,
> comparable to what GetNewRelFileNode() does for table files.
>
> #2 has some appeal because it could deal with random junk in
> $PGDATA/base regardless of how the junk got there. However, to do that
> in a really bulletproof way we'd have to check all the tablespace
> directories too, and that's starting to get a tad tedious for something
> that shouldn't happen anyway.
>
> So I'm leaning to #1 as a suitably low-effort fix. Thoughts?

It'd be nice to clean things up, but I understand the reluctance to do
so. Maybe a good compromise would be to warn about files that are
present in $PGDATA but don't show up in any catalogs.

Then again, if we're doing that, we could probably just nuke 'em...
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2006-09-27 20:52:51 Re: [GENERAL] Restart after power outage: createdb
Previous Message Ray Stell 2006-09-27 20:27:25 Re: How to Examine a view

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2006-09-27 20:52:51 Re: [GENERAL] Restart after power outage: createdb
Previous Message Chris Browne 2006-09-27 20:26:12 Re: PostgreSQL HA questions