Re: utf8 COPY DELIMITER?

From:	Andrew Dunstan <andrew(at)dunslane(dot)net>
To:	"Jim C(dot) Nasby" <jim(at)nasby(dot)net>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Mark Dilger <pgsql(at)markdilger(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: utf8 COPY DELIMITER?
Date:	2007-04-18 17:09:20
Message-ID:	462650C0.1050601@dunslane.net
Lists:	pgsql-hackers

Jim C. Nasby wrote:
> On Tue, Apr 17, 2007 at 02:28:18PM -0400, Tom Lane wrote:
>
>> I doubt that supporting a single multibyte character would be an
>> interesting extension --- if we wanted to do anything at all there, we'd
>> just generalize the delimiter to be an arbitrary string. But it would
>> certainly slow down COPY by some amount, which is an area where you'll
>> get push-back for performance losses, so you'd need to make a convincing
>> use-case for it.
>>
>
> Couldn't we use a fast code path (what we have now) for the case when
> the delimiter is a single byte? That would allow for multi-character
> delimiters without penalizing those that don't use them.
>
> As for use case, I worked on migrating some stuff out of a MySQL
> database a while ago, and having arbitrary string delimiters would have
> made life easier.
>

The first thing to note is that the COPY code is quite complex and
fragile. Personally, I'd want a heck of a lot of convincing to see it
changed, and your use case looks to me like it would be better handled
by preprocessing using a Perl script.
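
As a rough illustration of that kind of preprocessing, here is a minimal sketch (in Python rather than Perl) that rewrites a dump using an arbitrary string delimiter into the tab-delimited text format COPY already reads. The function names and the "||" delimiter in the note below are hypothetical. It assumes one row per line, that the delimiter string never occurs inside a field, and that no NULL marker needs translating.

```python
def escape_copy_text(field: str) -> str:
    """Escape characters that are special in COPY text format."""
    # Backslash must be handled first so the escapes added below
    # are not escaped a second time.
    return (field.replace("\\", "\\\\")
                 .replace("\t", "\\t")
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def convert(src, dst, delimiter: str) -> None:
    """Copy rows from src to dst, replacing a string delimiter with tabs.

    Assumption: each input line is exactly one row and the delimiter
    never appears inside a field value.
    """
    for line in src:
        fields = line.rstrip("\n").split(delimiter)
        dst.write("\t".join(escape_copy_text(f) for f in fields) + "\n")
```

Calling convert(sys.stdin, sys.stdout, "||") from a small wrapper and piping the result into COPY ... FROM STDIN would then load the data with the existing single-byte delimiter path, leaving the COPY code untouched.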

Also, if we accept string delimiters on input, we should also allow them
on output.

cheers

andrew

In response to

Re: utf8 COPY DELIMITER? at 2007-04-18 16:38:06 from Jim C. Nasby
