Re: Correctly producing array literals for prepared statements

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Peter Geoghegan <peter(dot)geoghegan86(at)gmail(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Correctly producing array literals for prepared statements
Date: 2011-02-23 15:36:01
Message-ID: 4D652961.2060709@dunslane.net
Lists: pgsql-hackers

On 02/23/2011 10:22 AM, Heikki Linnakangas wrote:
> On 23.02.2011 17:16, Andrew Dunstan wrote:
>> On 02/23/2011 10:09 AM, Peter Geoghegan wrote:
>>> On 23 February 2011 04:36, Greg Stark<gsstark(at)mit(dot)edu> wrote:
>>>> This is only true for server encodings. In a client library I think
>>>> you lose on this and do have to deal with it. I'm not sure what client
>>>> encodings we do support that aren't ascii-supersets though, it's
>>>> possible none of them generate quote characters this way.
>>> I'm pretty sure all of the client encodings Tatsuo mentions are ASCII
>>> supersets. The absence of by far the most popular non-ASCII superset
>>> encoding, UTF-16, as a client encoding indicated that to me. It isn't
>>> byte oriented, and Postgres is.
>>
>> They are not. It's precisely because they are not that they are not
>> allowed as server encodings.
>
> To be precise, they are all ASCII supersets in the sense that a valid
> 7-bit ASCII string is valid and means the same thing in all of the
> client-only encodings as well. The difference between supported
> server-encodings and those that are only supported as client_encoding
> is whether *all* bytes in a multi-byte character have the high bit
> set. All server-encodings have that property, and we rely on it in the
> backend. In the supported client-only encodings, the *first* byte of a
> multi-byte character is guaranteed to have the high bit set, but the
> subsequent bytes are not.

Yes, that's a better explanation.
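
The property Heikki describes can be checked byte by byte. What follows
is a minimal sketch, assuming Python's standard codecs behave like the
encodings under discussion: "shift_jis" stands in for a client-only
encoding (SJIS), while "euc_jp" and "utf-8" stand in for server
encodings. The helper name low_trailing_bytes is made up for this
illustration and is not part of PostgreSQL or libpq.

```python
def low_trailing_bytes(text, encoding):
    """Return (char, encoded bytes) for every multi-byte character that
    has a byte WITHOUT the high bit set after its first byte."""
    found = []
    for ch in text:
        raw = ch.encode(encoding)
        if len(raw) > 1 and any(b < 0x80 for b in raw[1:]):
            found.append((ch, raw))
    return found


sample = "ソ"  # KATAKANA LETTER SO, U+30BD

for enc in ("euc_jp", "utf-8", "shift_jis"):
    raw = sample.encode(enc)
    # Show the encoded bytes and any trailing bytes in the ASCII range.
    print(enc, raw.hex(), low_trailing_bytes(sample, enc))
```

In Shift_JIS this character encodes as 0x83 0x5C. The first byte has
the high bit set, but the second byte is the same value as an ASCII
backslash, which is why a byte-oriented scanner looking for quote or
escape characters cannot safely run over such an encoding.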

>
> Even that looser property isn't true for UTF-16, which is why we
> don't support it even as a client-only encoding.

UTF-16 also encodes every ASCII character with a NUL byte, which would
make it particularly hard to handle in code built around NUL-terminated
strings.
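
A short sketch of the NUL problem, assuming little-endian UTF-16
without a byte-order mark (Python's "utf-16-le" codec). The variable
names are only for illustration.

```python
text = 'a"b'

# Each ASCII character becomes two bytes: the ASCII value, then a zero byte.
utf16 = text.encode("utf-16-le")
print(utf16)

# Code that treats the buffer as a NUL-terminated string stops at the
# first zero byte, so it would only ever see this prefix.
c_string_view = utf16.split(b"\x00", 1)[0]
print(c_string_view)

# The first byte of a two-byte unit need not have the high bit set either.
print(utf16[0] < 0x80)
```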

There might be value in having a UTF-16-aware version of libpq that
would translate strings into UTF-8 on the way to the server and back
into UTF-16 on the way to the client.
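
A rough sketch of what such a translation layer would do at its
boundaries, written in Python only to show the idea. The function names
to_server and to_client are hypothetical and do not exist in libpq, and
the sketch assumes the client uses little-endian UTF-16 without a
byte-order mark and that client_encoding is UTF8 on the connection.

```python
def to_server(client_bytes: bytes) -> bytes:
    """Hypothetical outbound hook: UTF-16 from the application is
    re-encoded as UTF-8 before it goes on the wire."""
    return client_bytes.decode("utf-16-le").encode("utf-8")


def to_client(server_bytes: bytes) -> bytes:
    """Hypothetical inbound hook: UTF-8 from the server is re-encoded
    as UTF-16 before it is handed back to the application."""
    return server_bytes.decode("utf-8").encode("utf-16-le")


original = "héllo ソ".encode("utf-16-le")
wire = to_server(original)

# The UTF-8 form of this text contains no NUL bytes, and the round
# trip returns the application's original UTF-16 bytes.
print(b"\x00" in wire)
print(to_client(wire) == original)
```

With the conversion done at the edge of the library, everything between
the two hooks only ever sees UTF-8, an encoding the server already
supports.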

cheers

andrew
