Re: invalid UTF-8 via pl/perl

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hannu Krosing <hannu(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: invalid UTF-8 via pl/perl
Date: 2010-01-03 23:40:40
Message-ID: 4B412AF8.8070700@dunslane.net
Lists: pgsql-hackers

Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>
>> andrew=# select 'a' || invalid_utf_seq() || 'b';
>> ERROR: invalid byte sequence for encoding "UTF8": 0xd0
>> HINT: This error can also happen if the byte sequence does not
>> match the encoding expected by the server, which is controlled by
>> "client_encoding".
>> CONTEXT: PL/Perl function "invalid_utf_seq"
>>
>
>
>> That hint seems rather misleading. I'm not sure what we can do about it
>> though. If we set the noError param on pg_verifymbstr() we would miss
>> the error message that actually identified the bad data, so that doesn't
>> seem like a good plan.
>>
>
> Yeah, we want the detailed error info. The problem is that the hint is
> targeted to the case where we are checking data coming from the client.
> We could add another parameter to pg_verifymbstr to indicate the
> context, perhaps. I'm not sure how to do it exactly --- just a bool
> that suppresses the hint, or do we want to make a provision for some
> other hint or detail message?
>
>
>

Or instead of another param we could change the third param to be one of
(NO_ERROR, CLIENT_ERROR, SERVER_ERROR) or some such.
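
To make the three-way idea concrete, here is a rough standalone sketch. It is not the actual pg_verifymbstr() code: the enum, the function name verify_utf8_sketch, and the message buffer are all hypothetical, the UTF-8 check is deliberately simplified (lead byte plus continuation bytes only), and it returns 0 where the backend would raise an error via ereport(). The only point is that the mode argument decides whether the client_encoding hint gets attached.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical modes, named after the suggestion above. */
typedef enum
{
    VERIFY_NO_ERROR,        /* stay silent, just report failure to caller */
    VERIFY_CLIENT_ERROR,    /* build the error text with the client hint */
    VERIFY_SERVER_ERROR     /* build the error text without the hint */
} VerifyMode;

/* Sequence length implied by a UTF-8 lead byte, or 0 if it cannot lead. */
static int
utf8_seq_len(unsigned char c)
{
    if (c < 0x80)
        return 1;
    if (c >= 0xC2 && c <= 0xDF)
        return 2;
    if (c >= 0xE0 && c <= 0xEF)
        return 3;
    if (c >= 0xF0 && c <= 0xF4)
        return 4;
    return 0;
}

/*
 * Simplified verifier (assumption: not the real backend logic).
 * Returns 1 if the string passes, 0 otherwise.  In the two reporting
 * modes the error text is written into msg; only CLIENT_ERROR adds the
 * hint about client_encoding.
 */
static int
verify_utf8_sketch(const char *s, size_t len, VerifyMode mode,
                   char *msg, size_t msglen)
{
    size_t      i = 0;

    assert(s != NULL && msg != NULL && msglen > 0);
    msg[0] = '\0';

    while (i < len)
    {
        unsigned char c = (unsigned char) s[i];
        int         n = utf8_seq_len(c);
        int         ok = (n > 0 && i + (size_t) n <= len);

        /* every byte after the lead must look like 10xxxxxx */
        for (int k = 1; ok && k < n; k++)
            ok = (((unsigned char) s[i + k] & 0xC0) == 0x80);

        if (!ok)
        {
            if (mode != VERIFY_NO_ERROR)
            {
                int         w = snprintf(msg, msglen,
                    "ERROR: invalid byte sequence for encoding \"UTF8\": 0x%02x",
                    c);

                if (mode == VERIFY_CLIENT_ERROR && w > 0 && (size_t) w < msglen)
                    snprintf(msg + w, msglen - (size_t) w,
                        "\nHINT: This error can also happen if the byte "
                        "sequence does not match the encoding expected by "
                        "the server, which is controlled by "
                        "\"client_encoding\".");
            }
            return 0;
        }
        i += (size_t) n;
    }
    return 1;
}
```

With something along these lines, a caller checking data produced inside the server (a PL/Perl function result, say) would pass VERIFY_SERVER_ERROR and still get the message naming the offending byte, while the client input path would keep passing VERIFY_CLIENT_ERROR and keep the hint.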

Or we could just add another verify func. I don't have terribly strong
opinions about it.

Incidentally, I guess we need to look at plpython and pltcl for similar
issues.

cheers

andrew
