Re: Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding.

From: Amit Khandekar <amit(dot)khandekar(at)enterprisedb(dot)com>
To: Alex Hunsaker <badalex(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding.
Date: 2011-10-04 09:09:44
Message-ID: CACoZds1kqNZrr_6SWGfNCbHUpMDC0S_NP4zYu84MpMXw03tnfQ@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On 4 October 2011 14:04, Alex Hunsaker <badalex(at)gmail(dot)com> wrote:
> On Mon, Oct 3, 2011 at 23:35, Amit Khandekar
> <amit(dot)khandekar(at)enterprisedb(dot)com> wrote:
>
>> When GetDatabaseEncoding() != PG_UTF8, ret will not be equal to
>> utf8_str, so pg_verify_mbstr_len() will not get called. That's the
>> reason pg_verify_mbstr_len() is under the ( ret == utf8_str )
>> condition.
>
> Consider a latin1 database where utf8_str was a string of ASCII
> characters. Then no conversion would take place and ret == utf8_str,
> but the string would be verified by pg_do_encoding_conversion() and
> verified again by your added check :-).
>
>>> It might be worth adding a regression test also...
>>
>> I could not find any basic pl/perl tests in the regression
>> serial_schedule. I am not sure if we want to add just this scenario
>> without any basic tests for pl/perl ?
>
> I went ahead and added one in the attached based upon your example.
>
> Look ok to you?
>

+ if(GetDatabaseEncoding() == PG_UTF8)
+     pg_verify_mbstr_len(PG_UTF8, utf8_str, len, false);

In your patch, the above will again skip mb-validation if the database
encoding is SQL_ASCII. Note that pg_do_encoding_conversion() returns
the unconverted string even if only *one* of the source and destination
encodings is SQL_ASCII.

I think it should be:

    if (ret == utf8_str)
+   {
+       pg_verify_mbstr_len(PG_UTF8, utf8_str, len, false);
        ret = pstrdup(ret);
+   }

The (ret == utf8_str) condition is a reliable way to tell whether
pg_do_encoding_conversion() has done any conversion at all.
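
To make the pointer-identity argument concrete, here is a minimal standalone sketch. It does not use the PostgreSQL headers: Encoding, utf8_is_valid(), mock_convert() and utf8_to_db() are hypothetical stand-ins written only for illustration. mock_convert() models just the one property discussed here (the input pointer is returned unchanged when no conversion is done), NULL stands in for the error the backend would raise, and malloc() stands in for pstrdup().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for PostgreSQL's encoding IDs. */
typedef enum { ENC_SQL_ASCII, ENC_LATIN1, ENC_UTF8 } Encoding;

/* Structural UTF-8 check only (lead byte plus continuation bytes);
 * it does not reject every overlong or surrogate form. */
static bool utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len)
    {
        unsigned char c = s[i];
        size_t n, k;

        if (c < 0x80) n = 1;
        else if (c >= 0xC2 && c <= 0xDF) n = 2;
        else if ((c & 0xF0) == 0xE0) n = 3;
        else if (c >= 0xF0 && c <= 0xF4) n = 4;
        else return false;

        if (i + n > len)
            return false;               /* truncated sequence */
        for (k = 1; k < n; k++)
            if ((s[i + k] & 0xC0) != 0x80)
                return false;           /* bad continuation byte */
        i += n;
    }
    return true;
}

/* Mock converter: returns src itself when it does no work, which is
 * the behaviour the (ret == utf8_str) test relies on. Only
 * UTF8 -> LATIN1 is actually transcoded in this sketch. */
static char *mock_convert(char *src, size_t len, Encoding from, Encoding to)
{
    char *out;
    size_t i = 0, o = 0;

    if (from == to || from == ENC_SQL_ASCII || to == ENC_SQL_ASCII)
        return src;                     /* no conversion, no validation */

    if (!utf8_is_valid((const unsigned char *) src, len))
        return NULL;                    /* the real code would raise an error */
    out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    while (i < len)
    {
        unsigned char c = (unsigned char) src[i];

        if (c < 0x80) { out[o++] = (char) c; i += 1; }
        else if (c == 0xC2 || c == 0xC3)
        {
            /* U+0080..U+00FF map to a single LATIN1 byte */
            out[o++] = (char) (((c & 0x03) << 6) |
                               ((unsigned char) src[i + 1] & 0x3F));
            i += 2;
        }
        else { free(out); return NULL; } /* not representable in LATIN1 */
    }
    out[o] = '\0';
    return out;
}

/* The proposed shape: validate whenever the converter handed back
 * the input pointer, whatever the database encoding is. */
static char *utf8_to_db(char *utf8_str, size_t len, Encoding db_enc)
{
    char *ret = mock_convert(utf8_str, len, ENC_UTF8, db_enc);

    if (ret == utf8_str)
    {
        if (!utf8_is_valid((const unsigned char *) utf8_str, len))
            return NULL;
        ret = malloc(len + 1);          /* stands in for pstrdup() */
        if (ret != NULL)
        {
            memcpy(ret, utf8_str, len);
            ret[len] = '\0';
        }
    }
    return ret;
}
```

With this shape, a truncated sequence such as "ab\xC3" is rejected for ENC_SQL_ASCII, ENC_UTF8 and ENC_LATIN1 alike, whereas a check keyed on the database encoding being UTF8 would let it through in the SQL_ASCII case.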

I am OK with the new test. Thanks for doing it yourself!

> BTW thanks for the patch!
>
> [ side note ]
> I still think we should not be doing any conversion in the SQL_ASCII
> case but this slimmed down patch should be less controversial.
>
