Re: WIP patch: add (PRE|POST)PROCESSOR options to COPY

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP patch: add (PRE|POST)PROCESSOR options to COPY
Date: 2012-11-14 11:41:20
Message-ID: CA+U5nMLTWhyis-V8eBrZKbuhE3HW=RsQc4rshCcVa_JcU6PNUQ@mail.gmail.com
Lists: pgsql-hackers

On 13 September 2012 10:13, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> I'd like to add the following options to the SQL COPY command and the psql \copy
> instruction:
>
> * PREPROCESSOR: Specifies the user-supplied program for COPY IN. The data
> from an input file is preprocessed by the program before the data is loaded into
> a postgres table.
> * POSTPROCESSOR: Specifies the user-supplied program for COPY OUT. The data
> from a postgres table is postprocessed by the program before the data is stored
> in an output file.
>
> These options can be specified only when an input or output file is specified.
>
> These options allow moving data between postgres tables and, e.g., compressed
> files or files on a distributed file system such as Hadoop HDFS.

These options look pretty strange to me and I'm not sure they are a good idea.

If we want to read other/complex data, we have Foreign Data Wrappers.

What I think we need is COPY FROM (SELECT ...). COPY (query) TO
already exists, so this is just the same thing in the other direction.
Once we have a SELECT statement in both directions, we can add any
user-defined transforms we wish, implemented as database functions.
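
To make the existing direction concrete, here is a minimal sketch of
COPY (query) TO with a transform applied as a database function. The
table "events" and the function "redact" are hypothetical names used
only for illustration. The reverse direction has no syntax yet, so it
is not shown.

```sql
-- Hypothetical table and transform function, for illustration only.
CREATE TABLE events (id integer, payload text);

-- A user-defined transform: mask every digit in the input string.
CREATE FUNCTION redact(text) RETURNS text
    AS $$ SELECT regexp_replace($1, '[0-9]', '#', 'g') $$
    LANGUAGE sql IMMUTABLE;

-- Existing syntax: the query result, transform included, is written out.
COPY (SELECT id, redact(payload) FROM events) TO STDOUT WITH (FORMAT csv);
```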

At present, the only way to load data from an FDW is INSERT ... SELECT
from a foreign table, which means all the optimisations we've put into
COPY are useless with FDWs. So we need a way to speed up loads from
other data sources.
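
As a rough sketch of the load path available today, the following uses
the contrib file_fdw wrapper. The server name, table names and file
path are assumptions made up for this example.

```sql
-- file_fdw ships in contrib; all names and the path below are hypothetical.
CREATE EXTENSION file_fdw;
CREATE SERVER csv_files FOREIGN DATA WRAPPER file_fdw;

-- Expose a CSV file on the server as a foreign table.
CREATE FOREIGN TABLE events_src (id integer, payload text)
    SERVER csv_files
    OPTIONS (filename '/tmp/events.csv', format 'csv');

CREATE TABLE events_target (id integer, payload text);

-- The load itself is an ordinary INSERT ... SELECT, not a COPY.
INSERT INTO events_target
    SELECT id, lower(payload) FROM events_src;
```

The transform (lower() here) can be any database function, but the
rows are inserted by a plain INSERT statement rather than by COPY.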

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
