Re: Using multi-row technique with COPY

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Using multi-row technique with COPY
Date: 2005-11-29 20:58:22
Message-ID: 9872.1133297902@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> Tom Lane wrote:
>> Log, yes, unless it's a temp table. The point is we could avoid taking
>> buffer content locks. Come to think of it, we could implement that
>> trivially in the heapam.c routines; it would then apply to any table
>> update whether generated by COPY or otherwise.

> We should do that, yes, we've discussed that before.

Not that I recall ... anyway, I had second thoughts about this while
eating lunch. A buffer for a new table can reasonably be assumed not to
be accessed by any other backend, but we can *not* make that assumption
for the bgwriter. This means that the bgwriter might come along and
dump the buffer to disk while we are in the midst of scribbling on it.
Since we'll mark the thing dirty after we finish scribbling, no harm
is done --- unless there is a crash after we commit and before the next
write of the buffer occurs. In that case we have a corrupt copy of the
buffer on disk, which is going to pose a problem for recovery. (As an
example, the corrupt copy might have an invalid LSN that is larger than
it really should be, which will cause us to ignore the WAL record that
might have otherwise let us fix the problem.)

As far as I can see at the moment, such a problem will be repaired
during recovery *as long as you have the full-page-writes option on*.
Otherwise not. So we'd probably have to restrict the optimization to
happen only if that's on, as well as only for new tables.

>> Exactly my point; we don't have to change any syntax, so pg_dump
>> doesn't care.

> Does pg_dump wrap the CREATE TABLE and COPY into the same transaction?

No, but you can manually put BEGIN and END around the whole script.
Some people do that anyway. (It wouldn't be a bad idea to add an option
to pg_restore to do it, since there's no very convenient way to add
begin/end manually when piping pg_restore right into a database.)

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2005-11-29 21:10:45 Re: Using multi-row technique with COPY
Previous Message Simon Riggs 2005-11-29 20:42:49 Re: Using multi-row technique with COPY