Re: COPY enhancements

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Emmanuel Cecchet <manu(at)asterdata(dot)com>, Emmanuel Cecchet <Emmanuel(dot)Cecchet(at)asterdata(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY enhancements
Date: 2009-10-08 15:50:19
Message-ID: 24759.1255017019@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Subcommitting every single row is going to be really painful,
> especially after Hot Standby goes in and we have to issue a WAL record
> after every 64 subtransactions (AIUI).

Yikes ... I had not been following that discussion, but that sure sounds
like a deal-breaker. For HS, not this. But back to topic:

> Another possible approach, which isn't perfect either, is the idea of
> allowing COPY to generate a single column of output of type text[].
> That greatly reduces the number of possible error cases, and at least
> gets the data into the DB where you can hack on it. But it's still
> going to be painful for some use cases.

Yeah, that connects to the previous discussion about refactoring COPY
into a series of steps that the user can control.

Ultimately, there's always going to be a tradeoff between speed and
flexibility. It may be that we should just say "if you want to import
dirty data, it's gonna cost ya" and not worry about the speed penalty
of subtransaction-per-row. But that still leaves us with the 2^32
limit. I wonder whether we could break down COPY into sub-sub
transactions to work around that...

regards, tom lane
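
To make the subtransaction-per-row trade-off concrete, here is a minimal sketch of the pattern, not PostgreSQL's COPY code. It is an assumption-laden stand-in: it uses Python's built-in sqlite3 module and SQL SAVEPOINTs in place of backend subtransactions, and the table `t`, the savepoint name `copy_row`, and the helpers `open_demo_db` and `load_rows` are all hypothetical names chosen for illustration. Each input row gets its own savepoint, so a bad row is rolled back on its own while the rows around it survive.

```python
import sqlite3


def open_demo_db():
    """Create an in-memory database with a table that rejects dirty rows.

    isolation_level=None puts the connection in autocommit mode so the
    BEGIN/SAVEPOINT/COMMIT statements below are issued explicitly.
    """
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE t ("
        " id INTEGER PRIMARY KEY,"
        " qty INTEGER NOT NULL CHECK (typeof(qty) = 'integer'))"
    )
    return conn


def load_rows(conn, rows):
    """Insert rows one at a time, each inside its own savepoint.

    Returns (loaded_count, rejected), where rejected is a list of
    (row, error_message) pairs for rows that failed.
    """
    loaded = 0
    rejected = []
    cur = conn.cursor()
    cur.execute("BEGIN")
    for row in rows:
        # One savepoint per row: this is the per-row cost being debated.
        cur.execute("SAVEPOINT copy_row")
        try:
            cur.execute("INSERT INTO t (id, qty) VALUES (?, ?)", row)
        except sqlite3.Error as exc:
            # Undo only this row; earlier rows in the transaction are kept.
            cur.execute("ROLLBACK TO SAVEPOINT copy_row")
            rejected.append((row, str(exc)))
        else:
            loaded += 1
        finally:
            cur.execute("RELEASE SAVEPOINT copy_row")
    cur.execute("COMMIT")
    return loaded, rejected
```

Calling `load_rows(open_demo_db(), rows)` with a mix of clean and dirty tuples keeps the clean ones and reports the rest. The sketch says nothing about the 2^32 limit or WAL traffic; it only shows why the approach is flexible and why it pays a fixed overhead on every row.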
