Re: Undocumented feature costs a lot of performance in COPY IN

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bill Studenmund <wrstuden(at)netbsd(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Undocumented feature costs a lot of performance in COPY IN
Date: 2001-12-04 20:22:58
Message-ID: 3040.1007497378@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Bill Studenmund <wrstuden(at)netbsd(dot)org> writes:
> One alternative would be to make the code use different paths for the
> just-one and many delimiter cases. But then COPY OUT would need fixing.

Well, it's not clear what COPY OUT should *do* with multiple
alternatives, anyway. Pick one at random? I guess it does that now,
if you consider "always use the first one" as a random choice. The
real problem is that it will only backslash the first one, too. That
means that data emitted with DELIMITERS "|_=", say, will fail to be
reloaded correctly if that same DELIMITERS string is given to COPY IN
--- because any _ or = characters in the data won't be backslashed,
but would need to be to keep COPY IN from treating them as delimiters.
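
To make the round-trip failure concrete, here is a minimal sketch in Python. It is a toy model of the behavior described above, not the actual COPY code: the function names copy_out_line and copy_in_line are made up for illustration, and NULL markers, newlines and other escape sequences are ignored.

```python
def copy_out_line(fields, delimiters):
    """Toy COPY OUT: join with the FIRST delimiter character and
    backslash-escape only that character (plus backslash itself)."""
    delim = delimiters[0]
    escaped = []
    for field in fields:
        field = field.replace("\\", "\\\\")
        field = field.replace(delim, "\\" + delim)
        escaped.append(field)
    return delim.join(escaped)


def copy_in_line(line, delimiters):
    """Toy COPY IN: treat ANY unescaped character in `delimiters` as a
    field separator; a backslash makes the next character literal."""
    fields = []
    current = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            current.append(line[i + 1])
            i += 2
            continue
        if ch in delimiters:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))
    return fields


row = ["a_b", "c=d", "e|f"]
line = copy_out_line(row, "|_=")

# Same DELIMITERS string on input: the unescaped _ and = split the fields.
reloaded_multi = copy_in_line(line, "|_=")

# Single-character delimiter on input: the row survives the round trip.
reloaded_single = copy_in_line(line, "|")
```

In this model the three-column row comes back as five columns when "|_=" is given to both sides, while it reloads intact when the input side uses just "|".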

For COPY OUT's purposes, a sensible interpretation of a multicharacter
delimiter string would be that the whole string is emitted as the
delimiter. E.g.,

COPY OUT WITH DELIMITERS "<TAB>";

foo<TAB>bar<TAB>baz
...

But as long as COPY IN considers that delimiter spec to mean "any one of
these characters", and not a multicharacter string, we couldn't do that.
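
The conflict between the two interpretations can be sketched as follows, again as a rough Python illustration and not the real implementation. It assumes the delimiter is the literal five-character string "<TAB>" from the example above; the helper names split_whole_string and split_any_char are hypothetical, and escaping is left out entirely.

```python
def split_whole_string(line, delimiter):
    """Interpretation 1: the whole string is the delimiter."""
    return line.split(delimiter)


def split_any_char(line, delimiters):
    """Interpretation 2: any single character of the string is a
    delimiter, which is how COPY IN is described as behaving."""
    fields = []
    current = []
    for ch in line:
        if ch in delimiters:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


line = "foo<TAB>bar<TAB>baz"

as_string = split_whole_string(line, "<TAB>")
as_char_set = split_any_char(line, "<TAB>")
```

Under the whole-string reading the line yields three fields. Under the any-character reading each of <, T, A, B and > ends a field, so the same line yields the three values separated by runs of empty fields, and the two sides of COPY would disagree about the column count.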

If we restrict DELIMITERS strings to be exactly one character for a
release or three, we could think about implementing this idea of
multicharacter delimiter strings later on. Not sure if anyone really
needs it though. In any case, the current behavior is inconsistent.

regards, tom lane
