Re: [PATCH] Fix out-of-bouds access (src/common/wchar.c)

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: ranier(dot)vf(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Fix out-of-bouds access (src/common/wchar.c)
Date: 2022-02-17 06:51:26
Message-ID: 20220217.155126.649675036261400698.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Thu, 17 Feb 2022 14:58:38 +0900 (JST), Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote in
> (Sorry for the broken mail...)
>
> At Wed, 16 Feb 2022 09:29:20 -0300, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com> wrote in
> > > The patch:
> > > pg_encoding_max_length(int encoding)
> > > {
> > > - Assert(PG_VALID_ENCODING(encoding));
> > > -
> > > - return pg_wchar_table[encoding].maxmblen;
> > > + if (PG_VALID_ENCODING(encoding))
> > > + return pg_wchar_table[encoding].maxmblen;
> > > + else
> > > + return -1;
> > >
> > > Returning -1 for invalid encoding is, I think, flat wrong.
> > >
> > Ok, if -1 is wrong, what should the value of return if
> > somebody calling this function like:
> > pg_encoding_max_length(63);
>
> Should result in assertion failure, I think. If that fails, the
> caller side is anyhow broken. On the other hand we haven't had a
> complain about that, maybe.
>
> > Of course, with patch applied, because with original code
> > has memory corruption, if built and run without DEBUG.
>
> So we don't assume corruption in production build. It should be
> logically guaranteed.
>
> I'll dig into that further.

The number comes from pg_char_to_encoding, which is the internal
function of the SQL function PG_char_encoding(). It looks fine but I
can confirm that for all possible encoding names in pg_encname_tbl[].

=# select * from (
select e as name, PG_char_to_encoding(e) enc
from unnest(array['abc', 'alt', 'big5', 'euccn', 'eucjis2004', 'eucjp',
'euckr', 'euctw', 'gb18030', 'gbk', 'iso88591', 'iso885910', 'iso885913',
'iso885914', 'iso885915', 'iso885916', 'iso88592', 'iso88593', 'iso88594',
'iso88595', 'iso88596', 'iso88597', 'iso88598', 'iso88599', 'johab',
'koi8', 'koi8r', 'koi8u', 'latin1', 'latin10', 'latin2', 'latin3',
'latin4', 'latin5', 'latin6', 'latin7', 'latin8', 'latin9', 'mskanji',
'muleinternal', 'shiftjis', 'shiftjis2004', 'sjis', 'sqlascii', 'tcvn',
'tcvn5712', 'uhc', 'unicode', 'utf8', 'vscii', 'win', 'win1250', 'win1251',
'win1252', 'win1253', 'win1254', 'win1255', 'win1256', 'win1257',
'win1258', 'win866', 'win874', 'win932', 'win936', 'win949', 'win950',
'windows1250', 'windows1251', 'windows1252', 'windows1253', 'windows1254',
'windows1255', 'windows1256', 'windows1257', 'windows1258', 'windows866',
'windows874', 'windows932', 'windows936', 'windows949', 'windows950',
'hoge']) as e) as t where enc < 0 or enc > 41;
name | enc
------+-----
hoge | -1
(1 row)

So, the function doesn't return 63 for all registered names and wrong
names.

So other possibilities I can think of are..

- Someone had broken pg_encname_tbl[]

- Cosmic ray hit, or ill memory cell.

- Coverity worked wrong way.

Could you show the workload for the Coverity warning here?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-02-17 06:52:08 Re: Time to drop plpython2?
Previous Message Amit Kapila 2022-02-17 06:36:51 Re: row filtering for logical replication