Re: Updated COPY CSV patch

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: Updated COPY CSV patch
Date: 2004-04-14 19:13:23
Message-ID: 200404141913.i3EJDNi02660@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

Andrew Dunstan wrote:
> Bruce Momjian wrote:
>
> >
> >Do we need control for each column? What if we go with preferring NULL
> >for comma-comma, and then print warnings for NOT NULL columns and try
> >the promote. If you want comma-comma to be zero-length string, you can
> >create the column with NOT NULL, load the file, then ALTER TABLE to
> >allow NULL's again. Basically, the NOT NULL specification on the column
> >is the COPY CSV control method, rather than having it be in COPY.
> >
> >
> >
>
>
> If we can't do type specific stuff then we need to be able to have
> column-specific controls on export, at least.
>
> Consider a text column containing US 5-digit ZIP codes. If they are not
> quoted, a spreadsheet will almost certainly not preserve the leading
> zero some of them have, producing very undesirable results. However, a
> genuine numeric-type field must not be quoted, or the same spreadsheet
> won't see that value as a number. Unless we do stuff based on type, we
> have no way of knowing from the text representation of the data what we
> really need. Thus my proposal from this morning for column-specific user
> control over this aspect. And if we are going to have per column user
> control on export, why not on import too, to handle the NOT NULL
> problem? It might make life easier for us code-wise than chasing down
> nullability (e.g. in domains).

Wow, that is certainly an excellent point. When we import, we know the
resulting data type, but spreadsheets don't, and rely on the quoting to
know what to do with the value.

The zipcode is an excellent example. You can't even test for leading
zeros because then some spreadsheet values in the column are text and
some numeric.

We do have a column list capability with COPY already:

COPY tablename [ ( column [, ...] ) ]
FROM { 'filename' | STDIN }

Maybe we should extend that to control quoting on export and NULL
handling on import. Does that solve our problems?

FYI, do you have IM? I am:

AIM bmomjian
ICQ 151255111
Yahoo bmomjian
MSN root(at)candle(dot)pha(dot)pa(dot)us
IRC bmomjian via FreeNode or EFNet

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

In response to

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Bruce Momjian 2004-04-14 20:59:58 Re: win32 fixes
Previous Message Andrew Dunstan 2004-04-14 18:18:29 Re: Updated COPY CSV patch