Re: CSV mode option for pg_dump

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: CSV mode option for pg_dump
Date: 2006-06-14 12:47:55
Message-ID: 200606141247.k5ECltn03581@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


I agree with Andrew. If there are bugs in CSV, then we should fix them,
not avoid give users this usability improvement. I know I have analyzed
the code and fixed a few problems with it in the past.

As far as pg_dump, I am thinking the most flexible solution would be to
allow an arbitrary WITH clause to be added to COPY, like

pg_dump --copy-with "CSV"

This would allow not only CSV control, but specification of non-CSV
options if desired.

---------------------------------------------------------------------------

Andrew Dunstan wrote:
> Tom Lane said:
> > Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> >> This is just nonsense. There is not the slightest reason that CSV data
> >> cannot be embedded in a text dump nor exist as the data members of a
> >> custom or tar dump with the corresponding COPY statements modified
> >> accordingly.
> >
> > Well, the really *core* question here is whether we trust the stability
> > of the CSV format definition (and code) enough to want to rely on it
> > for data dump/restore purposes. I'm still a few years away from that,
> > myself. AFAICT the raison d'etre of the CSV code is "emit whatever it
> > takes to satisfy this, that, and the other broken Microsoft
> > application". That's fine as an export tool, but as a dump/reload tool,
> > nyet. If you put it in pg_dump you're just handing neophytes another
> > foot-gun.
> >
>
>
> Well, I'm staggered. Really.
>
> The CSV definition and its lack of formality is a complete red herring in
> this, as are references to Microsoft.
>
> The real issue in what you say here is your suggestion that we might not be
> able to reproduce the input in some cases via a COPY CSV round trip. If that
> is so it's a bug of our (probably my) making, and must be fixed. I assert
> that it is not so. In fact all the tests I did during development and since
> were premised on recovering the input exactly. The only CSV option that
> breaks things in that respect is FORCE NOT NULL, and it is designed for data
> coming in from a non Postgres source, so it's not relevant here. Even FORCE
> QUOTE won't break things because it never quotes a null value, and the only
> semantic significance to us of quoting is the null-ness of the value.
>
> If the code is broken then it should be discoverable by test or code
> analysis. There is no need to refer to any other application or standard. So
> if you or anyone think there is a case that will not reproduce the data
> exactly when the same CSV options are used for output and input, I challenge
> you or them to provide a single example.
>
> You say you're a few years away from trusting the code. Well, it's not so
> huge that it's beyond analysis, and I'll be happy to explain anything that
> puzzles you. Perhaps more importantly, it has been in use now for 18 months.
> We discovered one problem with embedded line feeds very early in the 8.0
> release cycle, and fixed it. After that I have not heard of a single
> problem. And I assure you this code is widely used.
>
> It probably isn't used much as a round trip mechanism, probably in part
> because we haven't provided it as a pg_dump option. So maybe we have a
> chicken/egg scenario here. We do have some round trip regression tests in
> the copy test, and those can be beefed up if necessary to increase your
> confidence level.
>
> I'm happy to debate details, but general assertions of "we can't trust this
> code" don't seem worth much to me.
>
> cheers
>
> andrew
>
>
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>

--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2006-06-14 13:18:31 Re: [PATCHES] PL/pgSQL: SELECT INTO EXACT
Previous Message Andrew Dunstan 2006-06-14 11:22:52 Re: CSV mode option for pg_dump