multiline CSV fields

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: multiline CSV fields
Date: 2004-11-10 23:10:46
Message-ID: 41929FF6.6090407@dunslane.net
Lists: pgsql-hackers pgsql-patches


Darcy Buskermolen has drawn my attention to some unfortunate behaviour of
COPY CSV when a field contains embedded line-end chars and the embedded
sequence isn't the same as the line ending used by the file containing the
CSV data. In that case we error out when reading the data in. This means
there are cases where we can produce a CSV data file which we can't read
back in, which is not at all pleasant.
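
To make the shape of such a file concrete, here is a small sketch using
Python's csv module as a stand-in. It does not reproduce COPY's behaviour or
its error; it only builds a file whose records end in CRLF while one quoted
field contains a bare LF, and lists the line-end sequences a reader would
meet. The sample rows are made up for illustration.

```python
import csv
import io
import re

# Hypothetical sample data: the second field of row 1 holds a bare LF.
rows = [["1", "first line\nsecond line"], ["2", "plain"]]

# Write the rows with CRLF record terminators; newline="" stops the
# StringIO from translating any line endings.
buf = io.StringIO(newline="")
csv.writer(buf, lineterminator="\r\n").writerows(rows)
data = buf.getvalue()

# Line-end sequences in file order: the embedded LF comes first and
# does not match the CRLF that terminates the records.
endings = re.findall(r"\r\n|\r|\n", data)

# A reader that tracks quoting recovers the original rows.
parsed = list(csv.reader(io.StringIO(data, newline="")))
```

Here the first line end the reader sees is the LF inside the quoted field,
which disagrees with the CRLF used everywhere else in the file.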

Possible approaches to the problem:
. make it a documented limitation
. have a "csv read" mode for backend/commands/copy.c:CopyReadLine() that
relaxes some of the restrictions on inconsistent line endings
. escape embedded line end chars
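
As a rough illustration of the second approach, a "csv read" mode could
track whether it is inside a quoted field and only treat a line end as a
record terminator when it is not. The sketch below is in Python rather than
C, and it is not the code in CopyReadLine(); the function name
split_csv_records is hypothetical, and it assumes the quote character is
escaped by doubling it. It also accepts any of LF, CR or CRLF outside
quotes, which may be looser than we would actually want.

```python
def split_csv_records(data, quote='"'):
    """Split raw CSV text into records, ignoring line ends inside quotes.

    Assumes quotes inside a field are escaped by doubling, so every
    quote character simply toggles the in-quotes state.
    """
    records = []
    start = 0
    in_quotes = False
    i = 0
    n = len(data)
    while i < n:
        ch = data[i]
        if ch == quote:
            # A doubled quote toggles twice, leaving the state unchanged.
            in_quotes = not in_quotes
        elif ch in "\r\n" and not in_quotes:
            # Line end outside quotes terminates the current record.
            records.append(data[start:i])
            # Treat CRLF as a single terminator.
            if ch == "\r" and i + 1 < n and data[i + 1] == "\n":
                i += 1
            start = i + 1
        i += 1
    # Keep a final record that has no trailing line end.
    if start < n:
        records.append(data[start:])
    return records


sample = '1,"a\nb"\r\n2,"say ""hi"""\r\n'
records = split_csv_records(sample)
```

With this kind of state tracking the embedded LF stays part of the first
record even though the file's own records end in CRLF.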

The last really isn't an option, because the whole point of CSVs is to
play with other programs, and my understanding is that those that
understand multiline fields (e.g. Excel) expect them not to be escaped,
and do not produce them escaped.

So right now I'm tossing up in my head between the first two options. Or
maybe there's another solution I haven't thought of.

Thoughts?

cheers

andrew
