Plan for CSV handling of quotes, NULL

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Plan for CSV handling of quotes, NULL
Date: 2004-04-14 22:51:44
Message-ID: 200404142251.i3EMpiY07491@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

Bruce Momjian wrote:
> Wow, that is certainly an excellent point. When we import, we know the
> resulting data type, but spreadsheets don't, and rely on the quoting to
> know what to do with the value.
>
> The zipcode is an excellent example. You can't even test for leading
> zeros because then some spreadsheet values in the column are text and
> some numeric.

I talked to Andrew on IRC and we went over the open CSV issues.

We talked about how we could do quoting for zipcode in TEXT fields and
not quote true numeric values without hardcoding datatypes into the
system. The most tricky case was NUMERIC vs. TEXT with zipcodes.
NUMERIC and TEXT have almost identical pg_type entries, so there is
nothing there to help us.

I found parse_coerce.c::TypeCategory(), which contains information about
which data type oids are in which grouping, e.g. DATETIME, STRING,
NUMERIC, etc. It seems that function, if called with
pg_type.typbasetype could help determine if quotes should be used. My
idea is to skip quotes for NUMERIC and DATETIME types, and quote
everything else. This means that user-defined types will always be
output with quotes, which is probably OK.

So, for open CSV items we have:

o add oid dump/reload
o handle loading of comma-comma into NOT NULL collumns
o handle quoting of TEXT type with zipcodes, etc.

For the NOT NULL cases, I am thinking we can just throw a warning the
first time a comma-comma is loaded into a TEXT column and promote to a
zero-lengh string. If the column is INT, it throws an error.

If we head in this direction, we will not need any additional syntax
except that which is in the patch already.

I have the master version of the patch and made his suggested changes
for the default for 'escape'.

Comments?

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

In response to

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Bruce Momjian 2004-04-15 02:27:52 Re: Plan for CSV handling of quotes, NULL
Previous Message Bruce Momjian 2004-04-14 20:59:58 Re: win32 fixes