
fix CSV multiline parsing - proof of concept

From:	Andrew Dunstan <andrew(at)dunslane(dot)net>
To:	"Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>
Subject:	fix CSV multiline parsing - proof of concept
Date:	2005-02-06 16:15:56
Message-ID:	420642BC.4000806@dunslane.net
Lists:	pgsql-patches

Attached is a proof-of-concept patch (i.e. not intended for application
just yet) to fix the problem of parsing CSV multiline fields.

Originally I indicated that the way to solve this IMHO was to combine
the reading and parsing phases of COPY for CSV. However, there's a lot
going on there, so I adopted a somewhat less invasive approach, which
detects whether a CR and/or NL should be part of a data value and, if so,
treats it as just another character. It also stops backslash from
escaping NL and CR in CSV mode; that escaping is clearly a bug.
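
To illustrate the idea (this is not the code in the attached patch, which is C inside COPY), here is a minimal Python sketch of the line-splitting rule described above. It assumes the quote character is `"` and that an embedded quote is written by doubling it; the function name `split_csv_records` is hypothetical. A CR or NL seen while inside a quoted field is kept as data, and backslash gets no special treatment.

```python
def split_csv_records(data, quote='"'):
    """Split raw CSV text into logical records (illustrative sketch only).

    Assumptions: `quote` both opens/closes a field and, when doubled,
    stands for a literal quote. Backslash is an ordinary character.
    """
    records = []
    current = []
    in_quote = False
    i = 0
    n = len(data)
    while i < n:
        ch = data[i]
        if ch == quote:
            # A doubled quote toggles the state twice, so we correctly
            # remain inside the quoted field afterwards.
            in_quote = not in_quote
            current.append(ch)
        elif ch in "\r\n" and not in_quote:
            # Outside quotes, CR, NL or CRNL terminates the record.
            if ch == "\r" and i + 1 < n and data[i + 1] == "\n":
                i += 1
            records.append("".join(current))
            current = []
        else:
            # Inside quotes, CR and NL are just another character.
            current.append(ch)
        i += 1
    if current:
        records.append("".join(current))
    return records
```

With this rule, `a,"line1\nline2",b` followed by a line ending comes back as a single record, while `x\` followed by NL and `y` comes back as two records, because the backslash no longer escapes the line ending.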

One thing I noticed is that (unless I misread the code) our standard
detection of the end marker \.<EOL> doesn't seem to require that it be
at the beginning of a line, as the docs say it should. I didn't change
that but did build a test for it into the special CSV code.
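
As a rough sketch of the stricter check (again in Python for illustration, not the patch's C code; `read_until_end_marker` is a hypothetical name), the marker is honoured only when `\.` is the entire content of a line, so a `\.` appearing later in a line is treated as ordinary data:

```python
def read_until_end_marker(lines):
    """Collect data lines up to, but not including, the end marker.

    Illustrative assumption: the marker counts only when backslash-dot
    is the whole line, i.e. it starts at the beginning of the line and
    is followed directly by the line ending.
    """
    rows = []
    for line in lines:
        content = line.rstrip("\r\n")
        if content == "\\.":
            break  # end-of-data marker at the start of a line
        rows.append(content)
    return rows
```

For example, a line such as `c\.` is kept as data, and reading stops only at a line consisting of `\.` alone.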

Comments welcome.

cheers

andrew

Attachment	Content-Type	Size
copy-csv-multiline.patch	text/x-patch	8.8 KB
