Re: Problem loading pg_dump file

From: "Mason Hale" <masonhale(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Problem loading pg_dump file
Date: 2007-01-30 17:57:34
Message-ID: 8bca3aa10701300957v1c45dd37ia9c7532e3f5e6a9c@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I've done a bit more digging into this, here's what I've found --

The text db dump file is much too big to edit by hand (~37GB), so I ran the
import in single-step mode:

psql -U bdu -s bdu_01_21_07 < bduprod_2-01-21-07

Here's the first error I run across:

***(Single step mode: verify
command)*******************************************
COPY blocked_info (id, created_at, reason_code, note, do_count_links) FROM
stdin;
***(press return to proceed or enter x and return to
cancel)********************
ERROR: invalid input syntax for integer: "2006-10-09 22:55:58"
CONTEXT: COPY blocked_info, line 1, column id: "2006-10-09 22:55:58"

The part of the script that is responsible for this error is:

COPY blocked_info (id, created_at, reason_code, note, do_count_links) FROM
stdin;
1 2006-10-09 22:55:58 0 \N \N
2 2006-10-09 22:55:58 0 \N \N
3 2006-10-09 22:55:58 0 \N \N
4 2006-10-09 22:55:58 0 \N \N
5 2006-10-09 22:55:58 0 \N \N
6 2006-10-10 13:03:27 \N \N \N
7 2006-10-10 13:06:28 \N \N \N
8 2006-10-10 13:09:37 \N \N \N
9 2006-10-27 22:39:49 \N \N \N
10 2006-10-27 22:39:58 \N \N \N
11 2006-11-09 04:48:18 \N \N \N
12 2006-11-30 16:03:58 \N \N f
13 2006-12-02 15:11:42 \N \N f
14 2006-12-04 12:31:20 \N \N f
15 2006-12-05 00:11:30 \N \N f
16 2006-12-05 00:15:45 \N \N f
17 2006-12-05 03:02:29 \N \N f
18 2006-12-05 12:03:10 \N \N f
20 2006-12-05 16:20:15 \N \N f
19 2006-12-05 16:20:15 \N \N f
21 2006-12-16 16:13:24 \N \N f
22 2006-12-19 16:06:43 \N \N f
23 2006-12-20 01:33:51 \N \N f
24 2006-12-21 18:38:56 \N \N f
25 2006-12-22 15:06:15 \N \N f
26 2006-12-23 09:43:17 \N \N f
27 2007-01-04 12:37:50 \N \N f
28 2007-01-08 17:33:26 \N \N f
29 2007-01-08 17:38:47 \N \N f
30 2007-01-13 15:32:34 \N \N f
\.

Now, I'm not too familiar with the copy command, but the above looks correct
to me.

When I let subsequent statements run, I get similar errors such as:

ERROR: invalid input syntax for integer: "0.261191951289869"
ERROR: invalid input syntax for type real: "2006-08-23 22:54:11.24"
ERROR: invalid input syntax for integer: "0.99655325708605502"
ERROR: invalid input syntax for type boolean: "1401353"

Any ideas what is causing this, or how to track this issue down?
------------------

In addition to the above, I did a new pg_dump using the -Fc option to
generate an archive in binary/compressed format.

I also verified that the versions of pg_dump and pg_restore on both machines
where the same (8.1.5).

I then copy that file over to our dev server an load it doing a pg_restore
-v (verbose mode). This time, I get an error when loading data into a
different table in the database (not the blocked_info table mentioned
above).

The error message is:
pg_restore: restoring data for table "rawfeed"
pg_restore: [custom archiver] could not read data block -- expected 4096,
got 3448
pg_restore: *** aborted because of error

I've tried running both with and without the -e (stop on errors) option, and
either way the pg_restore stops when it hits this error.

When I list the database contents using pg_restore -l -- the table where
error occurs is table listed as #14 out of 23. This particular table
includes a bytea column that contains gzipped data.

If I try to selectively restore any of the individual tables 1-13 (as listed
by pg_restore -l), using the -t option, I do not encounter any errors. But
if I try to restore any individual tables 14-23, I get the same error as
above (but with a different table name).

In addition, if I try to generate a sql script from pg_restore using the -f
option, for any of the individual tables 14-23, I get the same error: "could
not read data block -- expected 4096, got 3448".

> pg_restore -t rawfeed -f rawfeed.sql bduprod_2-01-25-07
pg_restore: [custom archiver] could not read data block -- expected 4096,
got 3448

In the case of table 14 (rawfeed), an output file is generated, it is
mid-way through a copy command. I can run that partially generated script
against the db without error. I can also verify that the last record in the
script is successfully added to the db.

Note: this generated file to restore this one table is huge: 9.2G by itself.
Is there an upper limit to the amount of data copy can load at one time?

If I generate a sql script for any of tables 15-23, it takes a long time for
the command to finally return the same error (about the same amount of time
to run the 'pg_restore -t rawfeed ...' variation above), and when it does an
output file is generated, but the file only contains the schema creation
commands and the first line of the copy command. It does not contain any of
the table data. This is true even for a table that contains only 1 row of
data.

----

At this point I'm not sure how to proceed.

My suspicion is still that this has something to do with the encoding of
data in the database. So I'm trying to narrow down the location of the
problem so that I can try to clean it up.

From trying to load the data from the pg_dump text format export, it looks
like something is wrong with the copy command when loading the blocked_info
table. However when trying to load the pg_dump -Fc binary format export, it
appears there's some problem with the rawfeed table.

Any ideas on what to try next will be greatly appreciated.

thanks in advance,
Mason

On 1/25/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> "Mason Hale" <masonhale(at)gmail(dot)com> writes:
> > I'm having a problem loading a recent pg_dump of our production
> database.
>
> > However, when trying to load the file for this month's snapshot, we are
> (for
> > the first time) seeing a slew of errors, such as:
>
> > invalid command \N
> > invalid command \N
> > ERROR: syntax error at or near ""/>\n <img alt="" style="" at
> character 1
> > LINE 1: "/>\n <img alt="" style="border: 0;
> > ^
>
> You need to look at the very first error, and ignore the slew following
> it. What seems to have happened here is that an error in the COPY command
> caused psql to fall out of copy mode (or perhaps never enter it in the
> first place) and start trying to treat lines of COPY data as SQL
> commands. So, tons of noise. What was the first error?
>
> regards, tom lane
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jeff Davis 2007-01-30 18:42:29 Re: PostgreSQL 9.0
Previous Message Rich Shepard 2007-01-30 17:41:06 Re: pg migrator