Re: Enforcing database encoding and locale match

From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Gregory Stark <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Enforcing database encoding and locale match
Date: 2007-09-28 20:18:50
Message-ID: 46FD61AA.6010904@sun.com
Lists: pgsql-hackers

Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> Gregory Stark wrote:
>>> "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>>>> Another possibility is to treat the case as a WARNING if you're
>>>> superuser and an ERROR if you're not. This would satisfy people
>>>> who are uncomfortable with the idea that CREATEDB privilege comes
>>>> with a built-in denial-of-service attack, while still leaving a
>>>> loophole for anyone for whom the test didn't work properly.
>>> That sounds like a good combination
>> +1
>
> After further experimentation I want to change the proposal a bit.
> AFAICS, if we recognize the nl_langinfo(CODESET) result, there is
> no reason not to trust the answer, so we might as well throw an
> error always.

Agreed. The code seems OK, and on a POSIX-compatible OS it should work.
I have attached test code. With the following command

for LOCALE in `locale -a`; do ./a.out $LOCALE ; done

it should be possible to verify the status on any Unix OS.
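The attachment itself is not reproduced in this message. As a rough sketch only, the core of such a test could look like the following. The helper name codeset_for_locale is hypothetical and is not taken from encoding.c; it relies only on the standard setlocale() and nl_langinfo(CODESET) calls.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the attached encoding.c): switch LC_CTYPE
 * to the named locale and copy the codeset name reported by
 * nl_langinfo(CODESET) into buf.
 *
 * Returns 0 on success, -1 if the locale cannot be set or buf is too small.
 */
static int
codeset_for_locale(const char *locale, char *buf, size_t buflen)
{
    const char *codeset;

    if (setlocale(LC_CTYPE, locale) == NULL)
        return -1;              /* locale not available on this system */

    codeset = nl_langinfo(CODESET);
    if (codeset == NULL || strlen(codeset) >= buflen)
        return -1;

    /* Copy out: the nl_langinfo result may be overwritten by later calls. */
    strcpy(buf, codeset);
    return 0;
}
```

A main() that calls this with argv[1] and then looks the result up in the encoding match table would be enough to produce one line of output per locale from the loop above.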

On Solaris I got the following problematic locales:

C ... 646 - NO MATCH
POSIX ... 646 - NO MATCH
cs ... 646 - NO MATCH
da ... 646 - NO MATCH
et ... 646 - NO MATCH
it ... 646 - NO MATCH
ja_JP.PCK ... PCK - NO MATCH
ko ... 646 - NO MATCH
no ... 646 - NO MATCH
ru ... 646 - NO MATCH
sl ... 646 - NO MATCH
sv ... 646 - NO MATCH
tr ... 646 - NO MATCH
zh.GBK ... GBK - NO MATCH
zh_CN.GB18030 ... GB18030 - NO MATCH
zh_CN.GB18030@pinyin ... GB18030 - NO MATCH
zh_CN.GB18030@radical ... GB18030 - NO MATCH
zh_CN.GB18030@stroke ... GB18030 - NO MATCH
zh_CN.GBK ... GBK - NO MATCH
zh_CN.GBK@pinyin ... GBK - NO MATCH
zh_CN.GBK@radical ... GBK - NO MATCH
zh_CN.GBK@stroke ... GBK - NO MATCH

> The case that is problematic is where we can get a
> CODESET string but we don't recognize it. In this case it seems
> appropriate to do
>
> ereport(WARNING,
> (errmsg("could not determine encoding for locale \"%s\": codeset is \"%s\"",
> ctype, sys),
> errdetail("Please report this to <pgsql-bugs(at)postgresql(dot)org>.")));
>
> and then let the user do what he wants.

Another question is what to do when we know that this codeset/encoding
is not supported by Postgres. Maybe extend the encoding match structure to

struct encoding_match
{
    enum pg_enc pg_enc_code;
    const char *system_enc_name;
    bool supported;
};

and generate an error when the codeset is known but unsupported. When the
codeset does not match any entry at all, generate only a warning.
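To make the proposal concrete, here is a minimal sketch of how the extended table could drive the decision. Everything except the struct layout is an assumption for illustration: the enum values stand in for the real enum pg_enc, the table entries and their supported flags are examples, and check_codeset with its result codes is a hypothetical name. It also treats a recognized codeset that differs from the database encoding as an error, as suggested earlier in the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for PostgreSQL's enum pg_enc; values are illustrative only. */
enum pg_enc { ENC_SQL_ASCII, ENC_UTF8, ENC_GB18030 };

struct encoding_match
{
    enum pg_enc pg_enc_code;
    const char *system_enc_name;
    bool        supported;
};

/* Example entries only; the supported flags here are assumptions. */
static const struct encoding_match encoding_match_list[] = {
    {ENC_SQL_ASCII, "646", true},
    {ENC_UTF8, "UTF-8", true},
    {ENC_GB18030, "GB18030", false},
    {ENC_SQL_ASCII, NULL, false}    /* end-of-list marker */
};

enum check_result { CHECK_OK, CHECK_WARNING, CHECK_ERROR };

/*
 * Hypothetical check: sys is the nl_langinfo(CODESET) string and
 * db_encoding is the encoding requested for the new database.
 */
static enum check_result
check_codeset(const char *sys, enum pg_enc db_encoding)
{
    const struct encoding_match *m;

    for (m = encoding_match_list; m->system_enc_name != NULL; m++)
    {
        if (strcmp(sys, m->system_enc_name) != 0)
            continue;
        if (!m->supported)
            return CHECK_ERROR;     /* known codeset, unsupported encoding */
        if (m->pg_enc_code != db_encoding)
            return CHECK_ERROR;     /* known codeset, wrong encoding */
        return CHECK_OK;
    }
    return CHECK_WARNING;           /* codeset not recognized at all */
}
```

With this example table, a codeset string such as "PCK" falls through to the warning case, while "GB18030" is rejected outright.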

Zdenek

Attachment Content-Type Size
encoding.c text/x-csrc 3.5 KB
