Re: Fwd: Copy out wording

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fwd: Copy out wording
Date: 2009-09-03 11:19:07
Message-ID: 4A9FA62B.7060808@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgadmin-hackers pgsql-hackers

Magnus Hagander wrote:
> Our documentation for COPY
> (http://www.postgresql.org/docs/8.4/static/sql-copy.html) has the
> following to say:
> "
> The CSV format has no standard way to distinguish a NULL value from
> an empty string. PostgreSQL's COPY handles this by quoting. A NULL is
> output as the NULL string and is not quoted, while a data value
> matching the NULL string is quoted. Therefore, using the default
> settings, a NULL is written as an unquoted empty string, while an
> empty string is written with double quotes (""). Reading values
> follows similar rules. You can use FORCE NOT NULL to prevent NULL
> input comparisons for specific columns.
> "
>
> Shouldn't that be:
> "A NULL is output as the NULL string and is not quoted, while a data
> value matching the empty string is quoted"?
>
> If not, then what really is the difference between a NULL and a NULL string?
>
>

No, it shouldn't. Let's say NULL is represented as "foo". Then a null
between delimiters will be written as

delimiter foo delimiter

while the string "foo" will be

delimiter quotechar foo quotechar delimiter

and an empty non-null string will be

delimiter delimiter

unless you have FORCE QUOTE on for it, in which case it will be

delimiter quotechar quotechar delimiter

We had quite a bit of debate on the shape of CSV output at the time it
was done (during 8.0), and that's what we came up with. It has the
useful property that we can round-trip the data, i.e. we can read back
the data we output without losing information about nulls, no matter
what the NULL string is, something we have always been resistant to
changing.

If you think we could explain it better, by all means have a go at it.
But your proposed change isn't accurate. Here is an illustration of the
above:

andrew=# copy (values (1, 'foo', 2),(3,null,4),(5,'',6 ) ) to
stdout null 'foo' csv header;
column1,column2,column3
1,"foo",2
3,foo,4
5,,6
andrew=# copy (values (1, 'foo', 2),(3,null,4),(5,'',6 ) ) to
stdout null 'foo' csv header force quote column2;
column1,column2,column3
1,"foo",2
3,foo,4
5,"",6

HTH

cheers

andrew

In response to

Responses

Browse pgadmin-hackers by date

  From Date Subject
Next Message Guillaume Lelarge 2009-09-03 11:20:32 New minor release?
Previous Message pgAdmin Trac 2009-09-03 11:18:36 Re: [pgAdmin III] #38: Role expiration date chooser does not display correctly

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2009-09-03 11:22:37 Re: Fwd: Copy out wording
Previous Message Sergey Konoplev 2009-09-03 10:11:02 Re: Feature request: DEFAULT as input value of function argument