Re: Nasty, propagating POLA violation in COPY CSV HEADER

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: David Fetter <david(at)fetter(dot)org>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Nasty, propagating POLA violation in COPY CSV HEADER
Date: 2012-06-20 15:48:21
Message-ID: 4FE1F0C5.70704@dunslane.net
Lists: pgsql-hackers

On 06/20/2012 11:02 AM, David Fetter wrote:
> Folks,
>
> A co-worker filed a bug against file_fdw where the columns in a
> FOREIGN TABLE were scrambled on SELECT. It turned out that this comes
> from the (yes, it's documented, but since it's documented in a place
> not obviously linked to the bug, it's pretty useless) "feature" of
> COPY CSV HEADER whereby the header line is totally ignored in COPY
> OUT.
>
> Rather than being totally ignored in the COPY OUT (CSV HEADER) case,
> the header line in should be parsed to establish which columns are
> where and rearranging the output if needed.
>
> I'm proposing to make the code change here:
>
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/commands/copy.c;h=98bcb2fcf3370c72b0f0a7c0df76ebe4512e9ab0;hb=refs/heads/master#l2436
>
> and a suitable doc change that talks about reading the header only for
> the purpose of matching column names to columns, and throwing away the
> output as before.
>
> What say?
>

First, you are surely talking about COPY IN, not COPY OUT.

This is not a bug; it is documented in exactly the place where all other
COPY options are documented. The file_fdw page refers the reader to the
COPY docs for details. Unless you want us to duplicate the entire COPY
docs in the file_fdw page, this seems entirely reasonable.

The current behaviour was discussed at some length back when we
implemented the HEADER feature, IIRC, and is quite intentional. I don't
think we should alter the current behaviour, as plenty of people rely on
it, some to my certain knowledge. I do see a reasonable case for adding
a new behaviour which takes notice of the header line, although it's
likely to have plenty of wrinkles.
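
To make the distinction concrete, here is a minimal Python sketch of the
two behaviours. It is not PostgreSQL code: the table columns, the function
names and the sample data are all hypothetical, and the header-matching
variant is only one guess at what an opt-in mode might do. It does not
address wrinkles such as duplicate or differently-cased header names.

```python
import csv
import io

# Hypothetical target table definition, in declared column order.
TABLE_COLUMNS = ["id", "name", "city"]


def load_ignoring_header(text, columns=TABLE_COLUMNS):
    """Skip the header line unread and map fields purely by position."""
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # header line is discarded, never inspected
    return [dict(zip(columns, row)) for row in reader]


def load_matching_header(text, columns=TABLE_COLUMNS):
    """Hypothetical opt-in mode: use header names to reorder the fields."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header is None:
        return []
    missing = [c for c in columns if c not in header]
    if missing:
        raise ValueError("header lacks columns: " + ", ".join(missing))
    # Resolve each table column to its position in the file, once up front.
    positions = [header.index(c) for c in columns]
    return [{c: row[i] for c, i in zip(columns, positions)} for row in reader]


# The file's column order differs from the table's column order.
SAMPLE = "name,id,city\nAlice,1,Paris\n"
```

With `SAMPLE`, the first function puts "Alice" in `id` and "1" in `name`,
which is the scrambling described in the report; the second puts each
value under the column named in the header.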

Reordering columns as you suggest might well have a significant impact
on COPY performance, BTW. Also note that I created the file_text_array
FDW precisely for people who want to be able to cherry-pick and reorder
columns. See <https://github.com/adunstan/file_text_array_fdw>.

cheers

andrew
