Re: COPY enhancements

From: Emmanuel Cecchet <manu(at)frogthinker(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Emmanuel Cecchet <manu(at)asterdata(dot)com>, Emmanuel Cecchet <Emmanuel(dot)Cecchet(at)asterdata(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY enhancements
Date: 2009-10-13 13:57:44
Message-ID: 4AD48758.7090502@frogthinker.org
Lists: pgsql-hackers

Tom Lane wrote:
> Ultimately, there's always going to be a tradeoff between speed and
> flexibility. It may be that we should just say "if you want to import
> dirty data, it's gonna cost ya" and not worry about the speed penalty
> of subtransaction-per-row. But that still leaves us with the 2^32
> limit. I wonder whether we could break down COPY into sub-sub
> transactions to work around that...
>
Regarding that tradeoff between speed and flexibility, I think we could
propose multiple options:
- maximum speed: the current implementation, which fails on the first
error
- speed with error logging: the COPY command fails if there is an error,
but continues through the input so that all errors are logged
- speed with best-effort error logging: no use of sub-transactions, but
errors that can safely be trapped with PG_TRY/PG_CATCH (no index
violation, no before-insert trigger, etc.) are logged and the command
can complete
- pre-loading (2-phase copy): phase 1 copies good tuples into a [temp]
table and bad tuples into an error table; phase 2 pushes the good tuples
to the destination table. Note that if phase 2 fails, it could be
retried, since the temp table would be dropped only on success of phase 2.
- slow but flexible: have every row in a sub-transaction -> are there any
real benefits compared to pgloader?
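
To make the pre-loading (2-phase copy) option a bit more concrete, here
is a rough sketch of the control flow in plain Python. It is only an
illustration under assumptions: the lists stand in for the temp table,
the error table and the destination table, and the names parse_row,
phase1_split and phase2_push are made up for this example. None of this
is existing COPY code or a proposed API.

```python
from typing import Callable, Iterable, List, Tuple

RawRow = Tuple[str, ...]
GoodRow = Tuple[int, str]
ErrorRow = Tuple[int, RawRow, str]  # (input line number, raw row, message)


def parse_row(raw: RawRow) -> GoodRow:
    """Hypothetical per-row check: expects (integer id, name)."""
    row_id, name = raw          # ValueError if the column count is wrong
    return int(row_id), name    # ValueError if the id is not an integer


def phase1_split(
    rows: Iterable[RawRow],
    validate: Callable[[RawRow], GoodRow],
) -> Tuple[List[GoodRow], List[ErrorRow]]:
    """Phase 1: good tuples go to staging, bad tuples to the error list."""
    staging: List[GoodRow] = []
    errors: List[ErrorRow] = []
    for lineno, raw in enumerate(rows, start=1):
        try:
            staging.append(validate(raw))
        except ValueError as exc:
            errors.append((lineno, raw, str(exc)))
    return staging, errors


def phase2_push(staging: List[GoodRow], destination: List[GoodRow]) -> None:
    """Phase 2: push staged tuples to the destination.

    Staging is emptied only after the push succeeded, so a failed
    phase 2 can be retried from the same staged data.
    """
    destination.extend(staging)
    staging.clear()


if __name__ == "__main__":
    data = [("1", "a"), ("x", "b"), ("3", "c")]
    staged, rejected = phase1_split(data, parse_row)
    target: List[GoodRow] = []
    phase2_push(staged, target)
    print(target)
    print(rejected)
```

In a real implementation phase 2 would presumably be a single
INSERT ... SELECT from the temp table, so it either fully succeeds or
leaves the staged data in place for a retry.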

Tom was also suggesting 'refactoring COPY into a series of steps that
the user can control'. What would these steps be? Would they be per row,
and would they allow discarding a bad tuple?

Emmanuel

--
Emmanuel Cecchet
FTO @ Frog Thinker
Open Source Development & Consulting
--
Web: http://www.frogthinker.org
email: manu(at)frogthinker(dot)org
Skype: emmanuel_cecchet
