backslash-dot quoting in COPY CSV

From: "Daniel Verite" <daniel(at)manitou-mail(dot)org>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: backslash-dot quoting in COPY CSV
Date: 2019-01-02 15:58:35
Message-ID: 10e3eff6-eb04-4b3f-aeb4-b920192b977a@manitou-mail.org
Lists: pgsql-hackers

Hi,

The doc on COPY CSV says about the backslash-dot sequence:

To avoid any misinterpretation, a \. data value appearing as a
lone entry on a line is automatically quoted on output, and on
input, if quoted, is not interpreted as the end-of-data marker

However, this quoting does not happen when \. sits on a line of its
own inside a field that is already quoted. Example:

COPY (select 'somevalue', E'foo\n\\.\nbar') TO STDOUT CSV;

outputs:

somevalue,"foo
\.
bar"

which conforms to the CSV rules, by which we are not allowed
to replace \. by anything AFAICS.
The trouble is that when trying to import this back with COPY FROM,
the import errors out at the backslash-dot and loads nothing.
Furthermore, if the data is meant to be embedded into a script,
this creates a potential risk of SQL injection.
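
To make the failure mode concrete, here is a small Python sketch. It is
not PostgreSQL or psql code: cut_at_end_marker is a hypothetical helper
that models an end-of-data check which looks at whole lines and ignores
CSV quote state, which is only an assumption about how the marker gets
picked up. Python's csv module stands in for a quote-aware CSV reader.

```python
import csv
import io

# The row from the example: the second field has a line holding only \.
row = ["somevalue", "foo\n\\.\nbar"]

# Serialize as CSV; the embedded newlines force the field to be quoted.
buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerow(row)
csv_text = buf.getvalue()


def cut_at_end_marker(text):
    # Hypothetical line-based scan (an assumption, not PostgreSQL's code):
    # stop at the first line that is exactly backslash-dot, without
    # tracking whether we are inside a quoted CSV field.
    kept = []
    for line in text.split("\n"):
        if line == "\\.":
            break
        kept.append(line)
    return "\n".join(kept)


# A quote-aware CSV parser recovers the original row intact.
roundtrip = next(csv.reader(io.StringIO(csv_text)))

# The quote-unaware scan cuts the data in the middle of the quoted field.
truncated = cut_at_end_marker(csv_text)
```

Under this model the data ends after the unterminated fragment
somevalue,"foo and everything after the \. line falls outside the data
stream. In an embedded script, that remainder would be read as whatever
comes next, which is the injection concern.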

Is this a known issue? I haven't found previous discussions on this.
It looks to me like the ability of backslash-dot to be an end-of-data
marker should be neutralizable for CSV. When the data is not embedded,
it's not needed anyway, and when it's embedded, we could surely think
of alternatives.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
