[2002-05-27 19:28] Tom Lane said:
| Neil Conway <nconway(at)klamath(dot)dyndns(dot)org> writes:
| > Can you elaborate on exactly how you'd like to see COPY's behavior
| > changed? What's the rationale for disallowing missing fields in
| > COPY input?
| Robustness. We've seen problem reports in the past when COPY got out of
| sync with the data stream because of loss or insertion of a newline (eg,
| due to faulty quoting logic in an application; this scenario is not as
| improbable as you might think). Rejecting lines that don't seem to have
| the right number of fields would go a long way towards detecting that
| sort of mistake. And, since the existing behavior with unexpected
| numbers of fields has never been documented, no one feels any strong
| compunction about changing it.
| This has been hashed out in several prior threads, and in fact I think
| Brent Verner has a nearly-ready-to-go patch implementing the agreed-on
| behavior. Now that I think about it, your patch is heading off in a
| quite different direction from what was agreed to: we were never
| intending that omitting trailing fields would be the cue for inserting
| default values. Without a way to specify a target column list, that's
| far too restrictive; and with it, it's unnecessary.
attached is an updated patch (copy.colspec.20020527.diff), against
HEAD, implementing the following COPY syntax
COPY sleep (n,o,w) FROM stdin;
where (n,o,w) may be a subset of columns in the "sleep" table.
If a column is not specified, and has an attribute default, the
default value will be evaluated and used. There is no support for
COPY (<attlist>) TO yet, but that should not be too hairy. A second
patch for pg_dump to support this syntax is attached, but I have
not even tried to apply it, sorry. The pg_dump patch is against
a circa 7.2-RELEASE cvs head. I attach it in case someone else
wants to help wrap this up. I have been death-marching for 3-4 months
and I'm not sure I see the top of the hill yet...
How it does what it does.
I modified gram.y to add an opt_column_list to CopyStmt. This
List is sent down to DoCopy. If no attlist is present, I build
an attlist from the relation's available attributes, enabling
the old behavior. Once inside CopyFrom, the List attlist is
scanned to build two arrays, one representing the actual index
of the specified column, and another containing available default
exprs. Attributes are read from the input stream and assigned to
their appropriate location in the new tuple. If there are any
defaults available, they are then evaluated and placed in the new
tuple arrays. That's really it. I'm not _at all_ confident that
there is not a better way to do this, but this is what I've done.
The patch will elog(ERROR) in DoCopy, before Copy(To|From) is
entered, if a column list is specified that cannot be handled
(non-existent columns, duplicate columns). Should I be waiting
until I'm _in_ Copy(To|From) to elog(ERROR) out?
I'll try to add some commentary to that snakey code over the next
week, and will resubmit an updated patch.
Anyone questions regarding this patch should be sent 'To:' me and
'CC:' the list, as I haven't had the luxury of reading the good
list lately :-(.
comments/brain-death-fixes are welcome.
"Develop your talent, man, and leave the world something. Records are
really gifts from people. To think that an artist would love you enough
to share his music with anyone is a beautiful thing." -- Duane Allman
In response to
pgsql-patches by date
|Next:||From: Joe Conway||Date: 2002-05-28 05:00:25|
|Subject: Re: revised sample SRF C function; proposed SRF API|
|Previous:||From: Hannu Krosing||Date: 2002-05-28 03:54:14|
|Subject: Re: SRF rescan testing|