Re: Problem with multi-job pg_restore

From: Brian Weaver <cmdrclueless(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Problem with multi-job pg_restore
Date: 2012-05-01 17:30:01
Message-ID: CAAhXZGsncrwgeHiKX5jtSiSCrHH36yoz=FKy-dEk-5+ZYjZpVg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here's the steps I'm taking

# pg_restore -l database.dump > database.list

Edit the database.list file and comment out the 'PROCEDURAL LANGUAGE -
plpgsql' item. Because the dump comes from an 8.4 system without
PL/PgSQL installed by default the create language emitted during
pg_restore causes the restore to fail; PL/PgSQL is installed by
default in 9.1.

# pg_restore --jobs=24 --create --dbname=template1
--use-list=database.list --verbose database.dump

After letting the restore run overnight I find the error message in
the log file and still one running 'postmaster' process with the
arguments 'postgresql dbname [local] COPY' still active. The
pg_restore command exited with a zero status as if all went well. I
checked and no other clients are connected to the system so my only
thought is that this is an orphaned instance from the pg_restore.

I'm hoping it's all related because I stopped all the running process
that would normally access the database. I'm in the phase of testing
the restore process because we're new to dealing with multi-terabyte
data and we're trying to figure out our options on supporting with
such large data sets going forward.

I guess I could add the PID of the process generating the error in the
log; if it's the pid of the leftover postgresql process that would be
a bit more definitive. It will probably be tomorrow afternoon before I
have any answer that include the PID given the runtime issues.

-- Brian

On Tue, May 1, 2012 at 12:59 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Brian Weaver <cmdrclueless(at)gmail(dot)com> writes:
>> I think I've discovered an issue with multi-job pg_restore on a 700 GB
>> data file created with pg_dump.
>
> Just to clarify, you mean parallel restore, right?  Are you using any
> options beyond -j, that is any sort of selective restore?
>
>> The problem occurs during the restore when one of the bulk loads
>> (COPY) seems to get disconnected from the restore process. I captured
>> stdout and stderr from the pg_restore execution and there isn't a
>> single hint of a problem. When I look at the log file in the
>> $PGDATA/pg_log directory I found the following errors:
>
>> LOG:  could not send data to client: Connection reset by peer
>> STATEMENT:  COPY public.outlet_readings_rollup (id, outlet_id,
>> rollup_interval, reading_time, min_current, max_current,
>> average_current, min_active_power, max_active_power,
>> average_active_power, min_apparent_power, max_apparent_power,
>> average_apparent_power, watt_hour, pdu_id, min_voltage, max_voltage,
>> average_voltage) TO stdout;
>
> I'm confused.  A copy-to-stdout ought to be something that pg_dump
> would do, not pg_restore.  Are you sure this is related at all?
>
>                        regards, tom lane

--

/* insert witty comment here */

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Brian Weaver 2012-05-01 17:37:18 Re: Problem with multi-job pg_restore
Previous Message Tom Lane 2012-05-01 17:19:48 Re: extending relations more efficiently