Re: COPY Transform support

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY Transform support
Date: 2008-04-03 15:00:26
Message-ID: 29848.1207234826@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dimitri Fontaine <dfontaine(at)hi-media(dot)com> writes:
> Here's a proposal for COPY to support the T part of an ETL, that is adding the
> capability for COPY FROM to Transform the data it gets.

> The idea is quite simple: adding to COPY FROM the option to run a function on
> the data before to call datatype_in functions.

The major concern I have about this is to ensure that no detectable
overhead is added to COPY when the feature isn't being used.

I am not actually convinced that the column-by-column design you seem to
have in mind is worth anything. The examples that I remember seeing
often involve removing columns, generating one column from multiple ones
or vice versa, dealing with nonstandard column delimiters, etc. What
would makes sense in my mind is a single function taking and returning
text, which is invoked once on each complete input line before it is
broken into fields.

This is, of course, just a substitute for running a sed or perl or
similar script over the data before feeding it to COPY --- and probably
not an amazingly good substitute at that. For instance, assuming you
like perl for text-wrangling, I'd fully expect the function approach
to be slower than an external script because of the large overhead of
getting into and out of libperl for each line,

In situations where it's actually useful to apply SQL functions rather
than text-mangling operations to the data, you always have the option to
COPY into a temp table and then do INSERT/SELECT from there.

So the whole thing seems just marginally attractive to me.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dimitri Fontaine 2008-04-03 15:06:14 Re: COPY Transform support
Previous Message Csaba Nagy 2008-04-03 14:57:53 Re: COPY Transform support