Re: Should CSV parsing be stricter about mid-field quotes?

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Joel Jacobson <joel(at)compiler(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Should CSV parsing be stricter about mid-field quotes?
Date: 2023-05-12 19:57:06
Message-ID: e819612f-f75f-ec88-0d0c-d63ffb6c8745@dunslane.net
Lists: pgsql-hackers


On 2023-05-11 Th 10:03, Joel Jacobson wrote:
> Hi hackers,
>
> I've come across an unexpected behavior in our CSV parser that I'd like to
> bring up for discussion.
>
> % cat example.csv
> id,rating,review
> 1,5,"Great product, will buy again."
> 2,3,"I bought this for my 6" laptop but it didn't fit my 8" tablet"
>
> % psql
> CREATE TABLE reviews (id int, rating int, review text);
> \COPY reviews FROM example.csv WITH CSV HEADER;
> SELECT * FROM reviews;
>
> This gives:
>
>  id | rating |                           review
> ----+--------+-------------------------------------------------------------
>   1 |      5 | Great product, will buy again.
>   2 |      3 | I bought this for my 6 laptop but it didn't fit my 8 tablet
> (2 rows)

Maybe this is unexpected to you, but it's not to me. What other sane
interpretation of that data could there be? And what CSV producer
outputs such horrible content? As you've noted, ours certainly does not.
Our rules are clear: quotes within quotes must be escaped (default
escape is by doubling the quote char). Allowing partial fields to be
quoted was a deliberate decision when CSV parsing was implemented,
because files with partially quoted fields had been seen in the wild.
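
To make those rules concrete, here is a minimal Python sketch of the
quoting logic described above. It is only an illustration under
simplifying assumptions, not PostgreSQL's actual COPY code: the function
name split_csv_line is made up for this example, it handles a single
line only (no embedded newlines), and it assumes the default case where
the escape character is the quote character itself.

```python
def split_csv_line(line, delim=",", quote='"'):
    """Split one CSV line into fields (hypothetical helper, for illustration).

    A quote character may open or close a quoted section anywhere in a
    field. Inside a quoted section, a doubled quote is a literal quote.
    """
    fields = []
    current = []
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == quote:
                if i + 1 < len(line) and line[i + 1] == quote:
                    # Doubled quote inside quotes: emit one literal quote.
                    current.append(quote)
                    i += 1
                else:
                    # Lone quote inside quotes: end of the quoted section.
                    in_quotes = False
            else:
                current.append(ch)
        elif ch == quote:
            # Quote outside quotes: start a quoted section, even mid-field.
            in_quotes = True
        elif ch == delim:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))
    return fields


row = split_csv_line(
    '2,3,"I bought this for my 6" laptop but it didn\'t fit my 8" tablet"'
)
print(row[2])

escaped = split_csv_line('2,3,"I bought this for my 6"" laptop"')
print(escaped[2])
```

Under this model the unescaped inch marks simply close and reopen the
quoted section, so they vanish from the value, which matches the result
Joel reported. With the quote doubled, as in the second call, it
survives as a literal character.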

So I don't think our behaviour is broken or needs fixing. As mentioned
by Greg, this is an example of the adage about being liberal in what you
accept.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
