Re: invalidly encoded strings

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Hannu Krosing <hannu(at)skype(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Tatsuo Ishii <ishii(at)postgresql(dot)org>, laurenz(dot)albe(at)wien(dot)gv(dot)at, pgsql-hackers(at)postgresql(dot)org
Subject: Re: invalidly encoded strings
Date: 2007-09-18 13:24:02
Message-ID: 46EFD172.7030408@dunslane.net
Lists: pgsql-hackers pgsql-patches

Hannu Krosing wrote:
> One fine day, Tue, 2007-09-18 at 08:08, Andrew Dunstan wrote:
>
>> Tom Lane wrote:
>>
>>> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>>>
>>>
>>>> Tom Lane wrote:
>>>>
>>>>
>>>>> What I think we'd need to have a complete solution is
>>>>>
>>>>> convert(text, name) returns bytea
>>>>> -- convert from DB encoding to arbitrary encoding
>>>>>
>>>>> convert(bytea, name, name) returns bytea
>>>>> -- convert between any two encodings
>>>>>
>>>>> convert(bytea, name) returns text
>>>>> -- convert from arbitrary encoding to DB encoding
>>>>>
>>>>> The second and third would need to do a verify step before
>>>>> converting, of course.
>>>>>
>>>>>
>>>> I'm wondering if we should give them disambiguating names, rather than
>>>> call them all convert.
>>>>
>>>>
>>> No. We have a function overloading system, we should use it.
>>>
>>>
>>>
>>>
>> In general I agree with you.
>>
>> What's bothering me here though is that in the two argument forms, if
>> the first argument is text the second argument is the destination
>> encoding, but if the first argument is a bytea the second argument is
>> the source encoding. That strikes me as likely to be quite confusing,
>> and we might alleviate that with something like:
>>
>> text convert_from(bytea, name)
>> bytea convert_to(text, name)
>>
>
> how is this fundamentally different from encode/decode functions we have
> now ?
>
>
>

They are in effect reversed. encode() applies the named encoding to a
bytea and returns text. convert_from() above undoes the named encoding
(i.e. it converts the bytea to text in the database encoding).
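
To make the direction of each call concrete, here is a rough Python analogy. It is only a sketch, not PostgreSQL code: Python's str stands in for text in the database encoding and bytes stands in for bytea. The function names mirror the ones discussed above, but the bodies, the 'hex'/'base64' format names and the Python codec names are assumptions made for illustration.

```python
import base64


def convert_to(text: str, dest_encoding: str) -> bytes:
    """text -> bytea: the second argument names the DESTINATION encoding."""
    return text.encode(dest_encoding)


def convert_from(data: bytes, src_encoding: str) -> str:
    """bytea -> text: the second argument names the SOURCE encoding.

    Decoding is strict, so invalidly encoded input raises
    UnicodeDecodeError; this plays the role of the verify step.
    """
    return data.decode(src_encoding, errors="strict")


def encode(data: bytes, fmt: str) -> str:
    """bytea -> text: applies a textual representation to raw bytes."""
    if fmt == "hex":
        return data.hex()
    if fmt == "base64":
        return base64.b64encode(data).decode("ascii")
    raise ValueError(f"unsupported format: {fmt}")


if __name__ == "__main__":
    raw = convert_to("é", "latin-1")      # one byte, 0xE9
    print(raw)
    print(convert_from(raw, "latin-1"))   # back to the original text
    print(encode(raw, "hex"))             # bytes rendered as text
```

In this sketch convert_from() and encode() both take bytes and return text, but they go in opposite directions: convert_from() interprets the bytes as characters in the named encoding, whereas encode() wraps arbitrary bytes in the named representation.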

cheers

andrew
