Updated COPY CSV patch

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Updated COPY CSV patch
Date: 2004-04-13 04:20:14
Message-ID: 200404130420.i3D4KEE02485@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

Andrew Dunstan wrote:
> If the null marker is not an empty string, it gets an error, of
> course - if it is it gets a null:
>
> [andrew(at)marmaduke pginst]$ echo ',,' | bin/psql -c "create temp
> table foo (a int, b text, c text); copy foo from stdin delimiter
> ',\"' null '\\\\N';" ERROR: invalid input syntax for integer:
> "" CONTEXT: COPY foo, line 1, column a: "" [andrew(at)marmaduke
> pginst]$ echo ',,' | bin/psql -c "create temp table foo (a int,
> b text, c text); copy foo from stdin delimiter ',\"' ;"
> [andrew(at)marmaduke pginst]$
>
>
> I hope that is expected behaviour - it's what *I* expect, at
> least.
> >
>
> Attached patch has these additions to previously posted patch:
> . quote character may not appear in NULL marker
> . any non-null value that matches the NULL marker is forced to be quoted
> when written.

OK, here is a new version of the patch that includes the grammar
changes we agreed upon, SGML changes, and \copy support. I will not
make any more changes without contacting you so feel free to make
adjustments and repost.

I have two open issues. First, CSV should support WITH OIDS, no?

Second, I found a problem with NULLs. If I do:
.
test=> create table test (x text, y text);
CREATE TABLE
test=> insert into test values ('', NULL);
INSERT 17221 1
test=>

then this:

test=> copy test to '/tmp/b' with csv;

creates:

"",

and this:

test=> copy test to '/tmp/b' with csv NULL 'fred';

creates:

,fred

Is that logical? A non-null field went from "" to nothing.

I think it is caused by this code:

bool force_quote = (strcmp(string, null_print) == 0);
CopyAttributeOutCSV(string, delim, quote, escape,
force_quote);

The reason it happens is that when the null string is '', it matches a
zero-length string, so the value is quoted. When the null stirng isn't
blank, a zero-length string doesn't match the null string so it isn't
quoted. I think we need to add special logic for zero-length strings
so they are always quoted, even if there is a special null string. This
will make our dumps more consistent, I think, or maybe the current
behavior is OK. It just struck me as strange.

I did a dump/reload test with a null string and null, and it worked
fine.

Is there any data that can not be dumped/reloaded via CSV?

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

Attachment Content-Type Size
unknown_filename text/plain 32.5 KB

In response to

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Claudio Natoli 2004-04-13 06:17:03 win32 fixes
Previous Message Sean Chittenden 2004-04-12 20:05:55 Re: Information/schema hiding...