Re: Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding.

From:	Alex Hunsaker <badalex(at)gmail(dot)com>
To:	Amit Khandekar <amit(dot)khandekar(at)enterprisedb(dot)com>
Cc:	Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding.
Date:	2011-10-05 06:30:01
Message-ID:	CAFaPBrSrsKFL7tJ2HM1Z6UvsjMGv19Q2vkDQi=rXSnYfE=Mv5w@mail.gmail.com
Lists:	pgsql-committers pgsql-hackers

On Tue, Oct 4, 2011 at 23:46, Amit Khandekar
<amit(dot)khandekar(at)enterprisedb(dot)com> wrote:
> On 4 October 2011 22:57, Alex Hunsaker <badalex(at)gmail(dot)com> wrote:
>> On Tue, Oct 4, 2011 at 03:09, Amit Khandekar
>> <amit(dot)khandekar(at)enterprisedb(dot)com> wrote:
>>> On 4 October 2011 14:04, Alex Hunsaker <badalex(at)gmail(dot)com> wrote:
>>>> On Mon, Oct 3, 2011 at 23:35, Amit Khandekar
>>>> <amit(dot)khandekar(at)enterprisedb(dot)com> wrote:
>>>>
>>>>> In the GetDatabaseEncoding() != PG_UTF8 case, ret will not be equal to
>>>>> utf8_str, so pg_verify_mbstr_len() will not get called. [...]
>>>>
>>>> Consider a latin1 database where utf8_str was a string of ascii
>>>> characters. [...]
>>
>>>> [Patch] Look ok to you?
>>>>
>>>
>>> + if(GetDatabaseEncoding() == PG_UTF8)
>>> + pg_verify_mbstr_len(PG_UTF8, utf8_str, len, false);
>>>
>>> In your patch, the above will again skip mb-validation if the database
>>> encoding is SQL_ASCII. Note that pg_do_encoding_conversion() returns
>>> the unconverted string even if only *one* of the src and dest encodings
>>> is SQL_ASCII.
>>
>> *scratches head* I thought the point of SQL_ASCII was that no encoding
>> conversion is done, so there would be nothing to verify.
>>
>> Ahh, I see: it looks like pg_verify_mbstr_len() will make sure there are
>> no NUL bytes in the string when we are in a single-byte encoding.
>>
>>> I think :
>>> if (ret == utf8_str)
>>> + {
>>> + pg_verify_mbstr_len(PG_UTF8, utf8_str, len, false);
>>> ret = pstrdup(ret);
>>> + }
>>>
>>> This (ret == utf8_str) condition would be a reliable way of knowing
>>> whether pg_do_encoding_conversion() has done the conversion at all.
>>
>> Yes. However (and maybe I'm nitpicking here), I don't see any reason to
>> verify certain strings twice if we can avoid it.
>>
>> What do you think about:
>> + /*
>> +  * When we are a PG_UTF8 or SQL_ASCII database, pg_do_encoding_conversion()
>> +  * will not do any conversion or verification; we need to do it manually
>> +  * instead.
>> +  */
>> + if (GetDatabaseEncoding() == PG_UTF8 ||
>> +     GetDatabaseEncoding() == PG_SQL_ASCII)
>> +     pg_verify_mbstr_len(PG_UTF8, utf8_str, len, false);
>>
>
> You mean the final changes in plperl_helpers.h would look something
> like this, right?
>
> static inline char *
> utf_u2e(const char *utf8_str, size_t len)
> {
>     char *ret = (char *) pg_do_encoding_conversion((unsigned char *) utf8_str,
>                                                    len, PG_UTF8, GetDatabaseEncoding());
>
>     if (ret == utf8_str)
> +   {
> +       if (GetDatabaseEncoding() == PG_UTF8 ||
> +           GetDatabaseEncoding() == PG_SQL_ASCII)
> +       {
> +           pg_verify_mbstr_len(PG_UTF8, utf8_str, len, false);
> +       }
> +
>         ret = pstrdup(ret);
> +   }
>     return ret;
> }

Yes.
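
For anyone who wants to trace that control flow outside a PostgreSQL build, here is a minimal standalone sketch. Every name in it is hypothetical: `stub_convert`, `stub_verify`, `dup_bytes` and `db_encoding` merely stand in for `pg_do_encoding_conversion()`, `pg_verify_mbstr_len()`, `pstrdup()` and `GetDatabaseEncoding()`. The stub conversion models only the two pass-through cases named in this thread (same encoding on both sides, or SQL_ASCII on either side); the real function may have other paths that hand the input back unchanged.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the backend's encoding IDs. */
typedef enum { ENC_SQL_ASCII, ENC_UTF8, ENC_LATIN1 } enc_t;

static enc_t db_encoding = ENC_UTF8;  /* stand-in for GetDatabaseEncoding() */
static int   verify_calls = 0;        /* counts manual verifications */

/* Stand-in for pstrdup(): copy len bytes into a fresh NUL-terminated buffer. */
static char *dup_bytes(const char *s, size_t len)
{
    char *out = malloc(len + 1);

    if (out == NULL)
        abort();
    memcpy(out, s, len);
    out[len] = '\0';
    return out;
}

/* Stub: only records the call; the real function validates the bytes. */
static void stub_verify(const char *s, size_t len)
{
    (void) s;
    (void) len;
    verify_calls++;
}

/*
 * Stub conversion: returns the input pointer untouched when both encodings
 * match or either one is SQL_ASCII; otherwise returns a fresh copy (a real
 * conversion would transcode the bytes as well).
 */
static char *stub_convert(const char *s, size_t len, enc_t src, enc_t dst)
{
    if (src == dst || src == ENC_SQL_ASCII || dst == ENC_SQL_ASCII)
        return (char *) s;
    return dup_bytes(s, len);
}

static char *sketch_utf_u2e(const char *utf8_str, size_t len)
{
    char *ret = stub_convert(utf8_str, len, ENC_UTF8, db_encoding);

    if (ret == utf8_str)
    {
        /* No conversion happened, so nothing has verified the input yet. */
        if (db_encoding == ENC_UTF8 || db_encoding == ENC_SQL_ASCII)
            stub_verify(utf8_str, len);

        /* Always hand back a copy the caller owns. */
        ret = dup_bytes(ret, len);
    }
    return ret;
}
```

With `db_encoding` set to `ENC_UTF8` or `ENC_SQL_ASCII` the sketch verifies once and copies; with `ENC_LATIN1` the stub conversion already produced a new buffer, so the manual verification is skipped.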

> Yeah I am ok with that. It's just an additional check besides (ret ==
> utf8_str) to know if we really require validation.
>
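
On the earlier point that pg_verify_mbstr_len() rejects NUL bytes when the encoding is single-byte: a rough sketch of such a check, assuming it amounts to scanning the buffer for an embedded zero byte. `has_embedded_nul` is a hypothetical helper, not a PostgreSQL function, and it leaves out the error reporting and length return of the real routine.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if the first len bytes of s contain a zero byte. */
static bool has_embedded_nul(const char *s, size_t len)
{
    return memchr(s, 0, len) != NULL;
}
```

The length is passed explicitly because a string coming from Perl may contain zero bytes before its end, so strlen() would not see the whole buffer.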
