Re: Problem loading pg_dump file

From: "Mason Hale" <masonhale(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Problem loading pg_dump file
Date: 2007-01-31 19:45:38
Message-ID: 8bca3aa10701311145g147093card70791c665276e39@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I was able to successfully able to dump and restore my database this
morning.

Here's what I did:

After doing single table restore to a text file of the rawfeed table (the
one the triggered the error),
I was able to get the id of the last row that was successfully exported. As
mentioned earlier I was
able to import all the data up to that point without error.

To see how many rows were loaded in my development database (where the
partial dataset was loaded) I ran:

select count(*) from rawfeed;
>> 287,478

The same table in the production database had 749,723 rows.

To get the id of the first 'bad' row, (the next one after the last
successfully imported row), I ran this in production:

select id from rawfeed offset 287477 limit 10;

I verified that the first id returned matched the id of that last row
successfully loaded on the dev server.
Then I took the id of the next row in the list (37126091) and did:

select * into rawfeed_backup where id = 37126091;
delete from rawfeed where id = 37126091;

After doing this I was able to do another dump of the entire database with
the -Fc option and was able to restore that on our dev server successfully.
BTW -- this includes dumping/restoring the rawfeed_backup table containing
the one 'bad row'.

Then I noticed was that our original dump file from 1 week ago was 7GB, and
the one today was 14GB.
We've had a lot of db activity, but I doubt our database has doubled in size
in just one week.

Now I'm thinking that the 7GB dump file was somehow truncated or aborted
before it was finished. (I ran the pg_dump as a background job, and didn't
capture the output... so there may have been an error I didn't see).

Does this file truncation theory sound consistent with the symptoms I
reported? Or does the 'single corrupted row of data' theory ring truer?

Mason

> > ***(Single step mode: verify
> > command)*******************************************
> > COPY blocked_info (id, created_at, reason_code, note, do_count_links)
> FROM
> > stdin;
> > ***(press return to proceed or enter x and return to
> > cancel)********************
> > ERROR: invalid input syntax for integer: "2006-10-09 22:55:58"
> > CONTEXT: COPY blocked_info, line 1, column id: "2006-10-09 22:55:58"
> >
> > The part of the script that is responsible for this error is:
> >
> > COPY blocked_info (id, created_at, reason_code, note, do_count_links)
> FROM
> > stdin;
> > 1 2006-10-09 22:55:58 0 \N \N
> > 2 2006-10-09 22:55:58 0 \N \N
> > 3 2006-10-09 22:55:58 0 \N \N
> > 4 2006-10-09 22:55:58 0 \N \N
>
> That's completely bizarre ... it seems like it's just lost the first
> field of the COPY data, which is not a failure I've ever heard of
> before. I have no theory about that at the moment.

I'm mystified too. Once I catch my breath, I circle back to take a closer
look at this. But for now, I can backup and restore the database. If I find
something more, I'll be sure to send a follow-up message.

Mason

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Mark Walker 2007-01-31 20:41:29 Re: Any Plans for cross database queries on the same server?
Previous Message Richard Troy 2007-01-31 19:28:02 Re: Any Plans for cross database queries on the same server?