Re: An idea for parallelizing COPY within one backend

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: "A(dot)M(dot)" <agentm(at)themactionfaction(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: An idea for parallelizing COPY within one backend
Date: 2008-02-27 16:41:02
Message-ID: 29436.1204130462@sss.pgh.pa.us
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Yeah, but it wouldn't take advantage of, say, the hack to disable WAL
> when the table was created in the same transaction.

In the context of a parallelizing pg_restore this would be fairly easy
to get around. We could probably teach the thing to combine table
creation and loading steps into one action (transaction) in most cases.
If that couldn't work because of some weird dependency chain, the
data loading transaction could be done as

BEGIN;
TRUNCATE table;
COPY table FROM stdin;
...
COMMIT;

which I believe already invokes the no-WAL optimization, and could
certainly be made to do so if not.

Obviously, pg_restore would have to be aware of whether or not it had
created that table in the current run (else it mustn't TRUNCATE),
but it would be tracking more or less exactly that info anyway to handle
dependency ordering.
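
The decision described above can be sketched roughly as follows. This is only an illustration of the bookkeeping, not pg_restore code: the function name `plan_table_load`, the `created_this_run` set, and the `can_combine` flag are all hypothetical, and identifier quoting is ignored. It also assumes, as the message does, that TRUNCATE followed by COPY in one transaction is eligible for the no-WAL optimization.

```python
from typing import List, Set


def plan_table_load(table: str, create_sql: str,
                    created_this_run: Set[str],
                    can_combine: bool) -> List[str]:
    """Return the SQL statements a restore might issue to load one table.

    created_this_run is the (hypothetical) set of tables this restore run
    has already created; can_combine says whether dependencies allow the
    CREATE and the COPY to share a single transaction.
    """
    copy = f"COPY {table} FROM stdin;"

    if can_combine:
        # Preferred case: create and load in the same transaction, so the
        # table is new within the transaction that fills it.
        created_this_run.add(table)
        return ["BEGIN;", create_sql, copy, "COMMIT;"]

    if table in created_this_run:
        # The table was created earlier in this run, so it holds nothing
        # worth keeping and TRUNCATE is safe.
        return ["BEGIN;", f"TRUNCATE {table};", copy, "COMMIT;"]

    # The table existed before this run: it must not be truncated, so the
    # data is loaded with a plain COPY.
    return [copy]
```

For example, calling `plan_table_load("t1", "CREATE TABLE t1 (id int);", {"t1"}, False)` yields the BEGIN / TRUNCATE / COPY / COMMIT sequence shown earlier, while a table missing from the set gets only the bare COPY.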

regards, tom lane
