Re: BUG #5334: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4

From: Alex Hunsaker <badalex(at)gmail(dot)com>
To: Tim Bunce <Tim(dot)Bunce(at)pobox(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5334: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4
Date: 2010-02-19 16:18:01
Message-ID: 34d269d41002190818t3df89d49h15e056d3c95f310@mail.gmail.com
Lists: pgsql-bugs

On Fri, Feb 19, 2010 at 02:30, Tim Bunce <Tim(dot)Bunce(at)pobox(dot)com> wrote:
> On Thu, Feb 18, 2010 at 11:32:38AM -0700, Alex Hunsaker wrote:
> > On Thu, Feb 18, 2010 at 11:09, Tim Bunce <Tim(dot)Bunce(at)pobox(dot)com> wrote:
> > *PLPerl::utf8::SWASHNEW = \&utf8::SWASHNEW;
> >
> > Hrm... It seems to work for me in HEAD and AFAICS we don't have that
> > line. Did I just miss it? Or did you happen to fix it in another way
> > with your refactoring?

> To be honest I'm not sure. I plan to look into that today.

My hunch is that it has to do with the 'require strict; require feature;'
lines. That's the only major difference I see (other than the require_op
and the code being in its own package/file).

>> I did a few quick tests but it failed miserably for me... I'm also not
>> fond of adding yet another closure. :)
>
> No amount of closure wrapping will fix the problem.

Yeah, brain fart... That's essentially what Safe.pm does now (and why
there is a problem :) )

>> Makes me think we might just be able to share some of utf8 package in the safe?
>
> I tried. The perl utf8.c code does a method lookup of SWASHNEW to decide
> if the utf8 module has been loaded. So if SWASHNEW is shared _before_
> utf8 is loaded *and used* then the method lookup works (it finds the
> shared stub) and the utf8 module never gets loaded.

Hrm... That seems wrong to me. Let me see if I can explain why. The
following is what you seem to be saying:

package utf8;
sub import { # or maybe this is a BEGIN
    return if utf8->can('SWASHNEW'); # already loaded
    # ok, not loaded: open the Unicode database and do junk which will
    # 'trap' in safe
    do 'utf8_heavy.pl';
}

So if we define SWASHNEW without loading the Unicode database, how will
utf8/unicode work exactly? I guess as long as it gets loaded at some
point it works. So for postgres, because we do the utf8 fix after
Safe->new and at that point we can't have any 'bad' strings, it will
work (with your hack). Sound right?
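
To make the stub point concrete, here is a minimal sketch in plain Perl.
The package name Demo is made up for illustration and this is not the
actual utf8.c check; it only shows that a forward-declared sub is found
by method lookup even though it has no body, which is presumably why a
shared stub could make utf8 look "already loaded":

```perl
use strict;
use warnings;

# Hypothetical package, for illustration only; this is not utf8.pm.
package Demo;
sub SWASHNEW;    # forward declaration: a stub with no body

package main;

# Method lookup finds the stub ...
my $found_by_lookup = Demo->can('SWASHNEW') ? 1 : 0;

# ... even though the sub has never been given a body.
my $has_body = defined(&Demo::SWASHNEW) ? 1 : 0;

print "method lookup finds it: $found_by_lookup\n";
print "sub has a body: $has_body\n";
```

If that matches what utf8.c does, then anything that creates the
SWASHNEW entry early is enough to satisfy the lookup, whether or not the
real code was ever loaded.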

It seems to me a more correct fix would be to 'require utf8;' inside of
the safe like we do with strict. Sorry, that's a bit handwavy. You have
obviously spent more time than me looking into this...

I'm thinking (in pseudo code):

#define SAFE_OK
....
sub ::mksafefunc {
    permit->(qw(caller require));
    reval->('require utf8; 1;');
    deny->(qw(caller require));
    ...
}
sub ::mk_strict_safefunc {
    ...reval->('use strict; require utf8;')
}

static void
plperl_safe_init(void)
{
    if (GetDatabaseEncoding() == PG_UTF8)
    {
        eval_pv("my $a=pack('U',0xC4); $a =~ /\\xE4\\d/i;", FALSE);
    }

    eval_pv(SAFE_MODULE, FALSE);
    eval_pv(SAFE_OK, FALSE);
}

One thing that stinks is that while we might not do the utf8fix if we are
not PG_UTF8, we would always 'require utf8;'. And I don't see an easy way
around that in 8.4 :( Also note that this is all entirely untested :( If
you think it's sane (and it might not be) I'm happy to work up a patch.
I'd favor this approach because if you have utf8 strings, the likelihood
that you want ::upgrade, ::downgrade, ::encode, ::valid or ::is_utf8
is fairly high. Then again, no one has complained thus far... Maybe
that's just me :)

Thoughts?

Anyhow, I can't reproduce this outside of postgres. Maybe you can give
me a pointer?

use Safe();
binmode(STDOUT, ':utf8');
print $Safe::VERSION . "\n";
my $safe = Safe->new('t');
$safe->permit('print');
$safe->reval('sub { print "\x{263a}\n"; }')->();
print $@ ."\n" if($@);
-----
2.22
