Re: CopyReadLineText optimization

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: CopyReadLineText optimization
Date: 2008-03-06 19:24:19
Message-ID: 47D044E3.3030604@dunslane.net
Lists: pgsql-hackers, pgsql-patches

Heikki Linnakangas wrote:
> Andrew Dunstan wrote:
>> Heikki Linnakangas wrote:
>>> Another update attached: It occurred to me that the memchr approach is
>>> only safe for server encodings, where the non-first bytes of a 
>>> multi-byte character always have the hi-bit set.
>>>
>>
>> We currently make the following assumption in the code:
>>
>>     * These four characters, and the CSV escape and quote characters, are
>>     * assumed the same in frontend and backend encodings.
>>     *
>>
>> The four characters are the carriage return, line feed, backslash and dot.
>>
>> I think the requirement might well actually be somewhat stronger than 
>> that: i.e. that none of these will appear as a non-first byte in any 
>> multi-byte client encoding character. If that's right, then we should 
>> be able to write CopyReadLineText without bothering about multi-byte 
>> chars. If it's not right then I suspect we have some cases that can 
>> fail now anyway.
>
> No, we don't require that, and we do handle it correctly. We use 
> pg_encoding_mblen to determine the length of each character in 
> CopyReadLineText when the encoding is a client-only encoding, and only 
> look at the first byte of each character. In CopyReadAttributesText, 
> where we have a similar loop, we've already transformed the input to 
> server encoding.

Oops. I see that now. Funny how I missed it when I went looking for it :-(

I think I understand the patch now :-)
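
To make the point above concrete, here is a minimal sketch (not the patch, and not PostgreSQL code) of why a plain memchr search can misfire on a client-only encoding while a character-length-aware loop does not. toy_sjis_mblen is a hypothetical stand-in for pg_encoding_mblen that only knows Shift_JIS-style lead bytes; the two find_backslash_* helpers are likewise invented for illustration. In Shift_JIS the two-byte character 0x83 0x5C has a second byte equal to ASCII backslash.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for pg_encoding_mblen(): returns the byte length of
 * the character starting at s, using Shift_JIS-style lead-byte ranges only.
 */
static int
toy_sjis_mblen(const unsigned char *s)
{
    unsigned char c = *s;

    if ((c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC))
        return 2;
    return 1;
}

/* Naive search: any 0x5C byte counts, even a trailing byte of a character. */
static int
find_backslash_memchr(const unsigned char *buf, int len)
{
    const unsigned char *p;

    assert(buf != NULL && len >= 0);
    p = memchr(buf, '\\', (size_t) len);
    return p ? (int) (p - buf) : -1;
}

/* Character-aware search: only the first byte of each character is tested. */
static int
find_backslash_mb(const unsigned char *buf, int len)
{
    int i = 0;

    assert(buf != NULL && len >= 0);
    while (i < len)
    {
        if (buf[i] == '\\')
            return i;
        i += toy_sjis_mblen(buf + i);   /* skip over any trailing bytes */
    }
    return -1;
}
```

For the six bytes 'a', 0x83, 0x5C, 'b', '\\', 'n', the memchr version reports offset 2 (the trailing byte of the two-byte character), while the character-aware version reports offset 4, the real backslash.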

I'm still a bit worried about applying it unless it gets some adaptive 
behaviour or something so that we don't cause any serious performance 
regressions in some cases. Also, could we perhaps benefit from inlining 
some calls, or is your compiler doing that anyway?
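
One possible shape for such adaptive behaviour, purely as a sketch and not something taken from the patch: remember how far the previous scan got before hitting a special character, and fall back to a plain byte loop when specials are coming thick and fast. ScanState, GAP_THRESHOLD and the scan_* functions are all hypothetical names, the threshold value is an untuned guess, and the code assumes an encoding in which these bytes never occur inside a multi-byte character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GAP_THRESHOLD 16        /* hypothetical, untuned cutoff in bytes */

typedef struct ScanState
{
    size_t      last_gap;       /* bytes skipped before the previous hit */
} ScanState;

/* Plain loop: cheap when a special character is only a few bytes away. */
static size_t
scan_loop(const char *buf, size_t len)
{
    size_t      i;

    for (i = 0; i < len; i++)
    {
        if (buf[i] == '\n' || buf[i] == '\r' || buf[i] == '\\')
            break;
    }
    return i;
}

/* memchr-based: pays a per-call cost, but is fast over long plain runs. */
static size_t
scan_memchr(const char *buf, size_t len)
{
    static const char specials[] = {'\n', '\r', '\\'};
    size_t      best = len;
    size_t      k;

    for (k = 0; k < sizeof(specials); k++)
    {
        /* search only up to the nearest hit found so far */
        const char *p = memchr(buf, specials[k], best);

        if (p != NULL)
            best = (size_t) (p - buf);
    }
    return best;
}

/* Returns the offset of the next special character, or len if none. */
static size_t
scan_adaptive(ScanState *st, const char *buf, size_t len)
{
    size_t      pos;

    assert(st != NULL && buf != NULL);
    if (st->last_gap < GAP_THRESHOLD)
        pos = scan_loop(buf, len);
    else
        pos = scan_memchr(buf, len);
    st->last_gap = pos;         /* remember the gap for the next call */
    return pos;
}
```

Whether a heuristic like this would actually avoid the regressions would need measuring on data with many escapes and on data with none.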

cheers

andrew
