fix CSV multiline parsing - proof of concept

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>
Subject: fix CSV multiline parsing - proof of concept
Date: 2005-02-06 16:15:56
Message-ID: 420642BC.4000806@dunslane.net
Lists: pgsql-patches


Attached is a proof-of-concept patch (i.e. not intended for application
just yet) to fix the problem of parsing CSV multiline fields.

Originally I indicated that the way to solve this IMHO was to combine
the reading and parsing phases of COPY for CSV. However, there's a
lot going on there, so I adopted a somewhat less invasive approach,
which detects whether a CR and/or NL should be part of a data value and,
if so, treats it as just another character. It also stops backslash from
escaping NL and CR in CSV mode; that escaping is clearly a bug.
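
As a rough illustration of the approach described above (this is not the
code in the attached patch), the sketch below scans a buffer for the end
of one CSV record. It treats CR and NL inside a quoted field as ordinary
data and gives backslash no special meaning. The function name
csv_record_end and its signature are hypothetical, and the sketch assumes
the escape character is the same as the quote character, so a doubled
quote simply toggles the quote state twice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the patch: return the offset of the
 * line terminator that ends the first CSV record in buf[0..len), or len
 * if no terminator is found outside a quoted field.
 */
size_t
csv_record_end(const char *buf, size_t len, char quote)
{
    bool   in_quote = false;
    size_t i;

    assert(buf != NULL);

    for (i = 0; i < len; i++)
    {
        char c = buf[i];

        if (c == quote)
            in_quote = !in_quote;   /* doubled quote toggles twice: no net change */
        else if ((c == '\n' || c == '\r') && !in_quote)
            return i;               /* unquoted CR or NL ends the record */
        /* inside quotes, CR and NL fall through as ordinary data;
         * backslash is never treated as an escape here */
    }
    return len;
}
```

For the input a,"b<NL>c",d<NL> this returns the offset of the second NL,
because the first one sits inside the quoted field.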

One thing I noticed is that (unless I misread the code) our standard
detection of the end marker \.<EOL> doesn't seem to require that it be
at the beginning of a line, as the docs say it should. I didn't change
that but did build a test for it into the special CSV code.
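
A check of the kind mentioned above, that the end marker only counts when
it starts a line, might look something like the following sketch. The
name is_end_marker_line and its parameters are assumptions made for
illustration and do not come from the patch or from copy.c.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: true if the bytes at buf[pos] are the end marker
 * "\." sitting at the beginning of a line and followed only by an EOL
 * (or the end of the input). Assumes pos <= len.
 */
bool
is_end_marker_line(const char *buf, size_t len, size_t pos)
{
    /* must be at the start of the input or right after a line terminator */
    if (pos > 0 && buf[pos - 1] != '\n' && buf[pos - 1] != '\r')
        return false;

    /* need room for both marker characters, and they must match */
    if (len - pos < 2 || buf[pos] != '\\' || buf[pos + 1] != '.')
        return false;

    /* nothing but EOL or end of input may follow the marker */
    return pos + 2 == len || buf[pos + 2] == '\n' || buf[pos + 2] == '\r';
}
```

With this test, a line such as abc\.<EOL> is not taken as the end marker,
while \.<EOL> on a line of its own is.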

comments welcome.

cheers

andrew

Attachment: copy-csv-multiline.patch (text/x-patch, 8.8 KB)
