Re: plperlu problem with utf8 [REVIEW]

From: Andy Colson <andy(at)squeakycode(dot)net>
To: Alex Hunsaker <badalex(at)gmail(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plperlu problem with utf8 [REVIEW]
Date: 2011-01-17 04:03:48
Message-ID: 4D33BFA4.8060505@squeakycode.net
Lists: pgsql-hackers

On 01/16/2011 07:14 PM, Alex Hunsaker wrote:
> On Sat, Jan 15, 2011 at 14:20, Andy Colson<andy(at)squeakycode(dot)net> wrote:
>>
>> This is a review of "plperl encoding issues"
>>
>> https://commitfest.postgresql.org/action/patch_view?id=452
>
> Thanks for taking the time to review!
>
> [...]
>>
>> The Patch:
>> ==========
>> Applies clean to git head as of January 15 2011. PG built with
>> --enable-cassert and --enable-debug seems to run fine with no errors.
>>
>> I don't think regression tests cover plperl, so understandable there are no
>> tests in the patch.
>
> FWIW there are plperl tests, you can do 'make installcheck' from the
> plperl dir or installcheck-world from the top. However I did not add
> any as AFAIK there is not a way to handle multiple locales with them
> (at least for the automated case).

Oh, cool. I'd kinda thought 'make check' was the one to run. I'll have to check out 'make check' vs. 'make installcheck'.

>> There are no manual updates in the patch either, and I think there should be.
>> I think it should be made clear
>> that data (varchar, text, etc. but not bytea) will be passed to perl as
>> UTF-8, regardless of database encoding
>
> I don't disagree, but I don't see where to put it either. Maybe it's
> only release note material?
>

I think this page:
http://www.postgresql.org/docs/current/static/plperl-funcs.html

Right after:
"Arguments and results are handled as in any other Perl subroutine: arguments are passed in @_, and a result value is returned with return or as the last expression evaluated in the function."

Add:

Arguments will be converted from the database encoding to UTF-8 for use inside plperl, and then converted from UTF-8 back to the database encoding upon return.

OR, that same sentence could be added to the next page:

http://www.postgresql.org/docs/current/static/plperl-data.html

However, this patch brings back DWIM to plperl. It should just work without having to worry about it. I'd be ok either way.
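
To make the round trip in the proposed sentence concrete, here is a rough sketch in Python (not Perl, and not how the patch itself does it). The helper names are made up for illustration, and LATIN1 is only an assumed example of a non-UTF-8 database encoding:

```python
# Illustrative only: mimics the "database encoding -> UTF-8 -> database
# encoding" round trip described above. These helpers are hypothetical and
# are not part of PostgreSQL or plperl.

def to_utf8(raw: bytes, db_encoding: str) -> bytes:
    """Re-encode bytes stored in the database encoding as UTF-8."""
    return raw.decode(db_encoding).encode("utf-8")


def from_utf8(utf8_bytes: bytes, db_encoding: str) -> bytes:
    """Re-encode UTF-8 bytes back into the database encoding."""
    return utf8_bytes.decode("utf-8").encode(db_encoding)


# Assumed example: a text value as stored in a LATIN1 database.
stored = "caf\u00e9".encode("latin-1")

# What the function body would see: the same text, as UTF-8.
inside_perl = to_utf8(stored, "latin-1")

# What goes back to the database when the function returns the value.
returned = from_utf8(inside_perl, "latin-1")
```

The e-acute is one byte in LATIN1 and two bytes in UTF-8, so the byte strings differ inside the function, but the value that comes back is byte-for-byte what was stored.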

>> Also that "use utf8;" is always loaded and in use.
>
> Sorry, I probably mis-worded that in my original description. It's that
> we always do the 'utf8fix' for plperl. Not that utf8 is loaded and in
> use. This fix basically makes sure the unicode database and associated
> modules are loaded. This is needed because perl will try to
> dynamically load these when you need them. As we restrict 'require' in
> the plperl case, things that depended on that would fail. Previously
> we only did the utf8fix when we were a PG_UTF8 database. I don't
> really think it's worth documenting, it's more a bug fix than anything
> else.
>

Agreed.

-Andy
