Re: missing data/global

From: Daniel Kalchev <daniel(at)digsys(dot)bg>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: missing data/global
Date: 2004-08-23 20:02:49
Message-ID: 200408232002.i7NK2og11336@dcave.digsys.bg
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom,

This is basically what I had in mind, but you described it better than I ever
could :)

What I need from this database system is just one database and probably not
all of the tables anyway (but some do seem valuable). This database happens to
be second in the pg_dumpall script. The next databases are rather big (and I
actually have more recent backup and could eventually recreate the data) The
valuable database hasn't had significant structure changes since the backup.

Looking at the files, I am confident which is the proper database oid - if
this cannot be properly fixed, is there .. reasonable way to dump data from
the (heap) files?

Here is what I have:

su-2.02# du
1747 ./base/1
1693 ./base/16555
1 ./base/77573557/pgsql_tmp
127036 ./base/77573557
1 ./base/13255137/pgsql_tmp
1379190 ./base/13255137
11246 ./base/95521309
1781 ./base/96388007
1 ./base/133512058/pgsql_tmp
11933861 ./base/133512058
13456555 ./base
98209 ./pg_xlog
41315 ./pg_clog
13596100 .

My database should be with oid 77573557, template0 is apparently 16555

Let's see how all this works.

Daniel

>>>Tom Lane said:
> Daniel Kalchev <daniel(at)digsys(dot)bg> writes:
> > Is there ANY chance to recover data from a database system that suffered d
isk
> > crash, and is not missing the data/global directory?
> > Version is 7.2.4. Database files seem to be intact as well as pg_clog and
> > pg_xlog directories.
>
> The hard part I think would be reconstructing pg_database, because you'd
> need to get the database OIDs right. I can't think of any way to do
> that that doesn't involve poking at the file with a hex editor.
>
> Here's a sketch of how I'd proceed:
>
> 1. Make a tar backup of what you have! That way you can start over
> after you screw up ;-)
>
> 2. I assume you know the names and properties of your databases, users,
> and groups if any; also the SYSID numbers for the users and groups.
> A recent pg_dumpall script would be a good place to get this info.
>
> 3. You're also going to need to figure out the OIDs of your databases
> (the OIDs are the same as the names of their subdirectories under
> $PGDATA/base). Possibly you can do this just from directory/file sizes.
> Note that template1 should be OID 1, and template0 will have the next
> lowest number (probably 16555, in 7.2).
>
> 4. Initdb a scratch database in some other place (or move aside your
> existing files, if that seems safer). In this scratch DB, create
> databases, users, and groups to match your old setup. You should be
> able to duplicate everything except the database OIDs using standard
> SQL commands.
>
> 5. Shut down scratch postmaster, then hex-edit pg_database to insert the
> correct OIDs. Use pg_filedump or a similar tool to verify that you did
> this properly.
>
> 6. Restart scratch postmaster, and VACUUM FREEZE pg_database, pg_shadow,
> and pg_group (from any database). This will make the next step safe.
>
> 7. Stop scratch postmaster, and then copy over its $PGDATA/global
> directory into the old DB.
>
> 8. Cross your fingers and start postmaster ...
>
> This will probably *not* work if you had been doing anything to
> pg_database, pg_shadow, or pg_group between your last checkpoint and the
> crash, because the reconstructed tables are not going to be physically
> identical to what they were before, so any actions replayed from WAL
> against those tables will be wrong. Hopefully you won't have that
> problem. If you do, it might work to shut down the postmaster and again
> copy the scratch $PGDATA/global directory into the old DB, thereby
> overwriting what the WAL replay did. This is getting into the realm of
> speculation though.
>
> regards, tom lane

Browse pgsql-hackers by date

  From Date Subject
Next Message Manfred Spraul 2004-08-23 20:19:20 Re: fsync and hardware write cache
Previous Message Tom Lane 2004-08-23 19:20:09 Re: returning modified input from C functions