From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Daniel Verite <daniel(at)manitou-mail(dot)org> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: backslash-dot quoting in COPY CSV |
Date: | 2019-01-25 03:09:30 |
Message-ID: | 20190125030930.GC22001@momjian.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Jan 2, 2019 at 04:58:35PM +0100, Daniel Verite wrote:
> Hi,
>
> The doc on COPY CSV says about the backslash-dot sequence:
>
> To avoid any misinterpretation, a \. data value appearing as a
> lone entry on a line is automatically quoted on output, and on
> input, if quoted, is not interpreted as the end-of-data marker
>
> However this quoting does not happen when \. is already part
> of a quoted field. Example:
>
> COPY (select 'somevalue', E'foo\n\\.\nbar') TO STDOUT CSV;
>
> outputs:
>
> somevalue,"foo
> \.
> bar"
>
> which conforms to the CSV rules, by which we are not allowed
> to replace \. by anything AFAICS.
> The trouble is, when trying to import this back with COPY FROM,
> it will error out at the backslash-dot and not import anything.
> Furthermore, if these data are meant to be embedded into a
> script, it creates a potential risk of SQL injection.
>
> It is a known issue? I haven't found previous discussions on this.
> It looks to me like the ability of backslash-dot to be an end-of-data
> marker should be neutralizable for CSV. When the data is not embedded,
> it's not needed anyway, and when it's embedded, we could surely think
> of alternatives.
I was unable to reproduce the failure here using files:
CREATE TABLE test (x TEXT);
INSERT INTO test VALUES (E'foo\n\\.\nbar');
COPY test TO STDOUT CSV;
"foo
\.
bar"
COPY test TO '/u/postgres/tmp/x' CSV;
COPY test FROM '/u/postgres/tmp/x' CSV;
SELECT * FROM test;
x
-----
foo+
\. +
bar
foo+
\. +
bar
but I am able to see the failure using STDIN:
COPY test FROM STDIN CSV;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
"foo
\.
ERROR: unterminated CSV quoted field
CONTEXT: COPY test, line 1: ""foo
This seems like a bug to me. Looking at the code, psql issues the
prompts for STDIN, but when it sees \. alone on a line, it has no idea
you are in a quoted CSV string, so it thinks the copy is done and sends
the result to the server. I can't see an easy way to fix this. I guess
we could document it.
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
From | Date | Subject | |
---|---|---|---|
Next Message | Chapman Flack | 2019-01-25 04:46:29 | Re: PostgreSQL vs SQL/XML Standards |
Previous Message | Amit Kapila | 2019-01-25 02:50:03 | Re: WIP: Avoid creation of the free space map for small tables |