Re: help with data recovery from injected UPDATE

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Gus Gutoski <shared(dot)entanglement(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-general(at)postgresql(dot)org
Subject: Re: help with data recovery from injected UPDATE
Date: 2009-06-23 19:22:19
Message-ID: b42b73150906231222w1d99278am75339f86dd86210d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tue, Jun 23, 2009 at 2:05 PM, Gus
Gutoski<shared(dot)entanglement(at)gmail(dot)com> wrote:
> Success, of sorts.  I was able to retrieve 90% the corrupted data by
> dumping the heap file.  Many thanks to those who replied with helpful
> suggestions.
>
> If you're interested in detail then read on.  Otherwise, don't bother.
>
> The data was still in the table -- I could see it using a hex editor.
> This surprised me, as autovacuum was on at the time of corruption.
> Perhaps vacuum didn't bother reclaiming storage space because the
> database is relatively small and low-traffic.
>
> The attempt at point-in-time-recovery via transaction logs was doomed
> to failure, as I do not have a file system backup from before the
> corruption.  Still, I tried Merlin's trick with pg_resetxlog to no
> avail.
>
> I tried using the pg_filedump utility to dump the heap file, but it
> wasn't what I needed.  I later discovered a souped-up utility called
> pg_dumpdata:
> http://blogs.sun.com/avalon/entry/recovering_postgres_data
> While this utility still didn't provide everything I needed, it was a
> sufficient starting point.
> (It's written for postgres 8.2, whereas I'm running 8.1 -- it
> segfaulted when I first ran it on my heap file.)
>
> I sifted through the postgres source tree looking for the code that
> reads/writes the heap files, but I couldn't make head or tail of
> anything.  In the end, it was easier to reverse engineer the format
> for user data and use the pg_dumpdata source as a base to get me to
> the "items" in the heap files.  The reason that I couldn't get 100% of
> the lost data is that the heap tuple header that points to the user
> data sometimes landed me at a random point in the middle of the item,
> rather than at the beginning.  At this point I gave up trying to get
> the last 10% of the data -- I had run out of time and patience.
>
> Having partially learned my lesson, I've set up a utility to run
> pg_dump each day.  After I've taken a break, I'll look into a
> reasonabe set-up for file system backups with point-in-time recovery.
> But really, what are the chances anything like this will ever happen
> again? ;-)

Regular scheduled pg_dump is often enough :-)

merlin

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Radcon Entec 2009-06-23 20:27:38 Separating raise notice lines
Previous Message Emanuel Calvo Franco 2009-06-23 19:19:28 Re: Replication