Re: TM format can mix encodings in to_char()

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: juanjo(dot)santamaria(at)gmail(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: TM format can mix encodings in to_char()
Date: 2019-04-19 08:30:17
Message-ID: 20190419.173017.204258244.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello.

At Fri, 12 Apr 2019 18:45:51 +0200, Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com> wrote in <CAC+AXB22So5aZm2vZe+MChYXec7gWfr-n-SK-iO091R0P_1Tew(at)mail(dot)gmail(dot)com>
> Hackers,
>
> I will use as an example the code in the regression test
> 'collate.linux.utf8'.
> There you can find:
>
> SET lc_time TO 'tr_TR';
> SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
> to_char
> -------------
> 01 NIS 2010
> (1 row)
>
> The problem is that the locale 'tr_TR' uses the encoding ISO-8859-9
> (LATIN5),
> while the test runs in UTF8. So the following code will raise an error:
>
> SET lc_time TO 'tr_TR';
> SELECT to_char(date '2010-02-01', 'DD TMMON YYYY');
> ERROR: invalid byte sequence for encoding "UTF8": 0xde 0x75

The same case is handled for lc_numeric. lc_time ought to be
treated the same way.
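
For reference, the two bytes in the error message are consistent with LATIN5 text being passed through unconverted: in ISO-8859-9, 0xde is "Ş" and 0x75 is "u", while in UTF-8 a lead byte such as 0xde must be followed by a continuation byte of the form 10xxxxxx. A minimal standalone sketch of that check follows. The helper name is hypothetical and this is not PostgreSQL's own validation code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a PostgreSQL function): returns true if the
 * buffer starts with a well-formed two-byte UTF-8 sequence.
 */
static bool
starts_with_valid_utf8_pair(const unsigned char *s, size_t len)
{
	assert(s != NULL);

	if (len < 2)
		return false;

	/* Lead byte must be 110xxxxx; 0xC0 and 0xC1 would be overlong forms */
	if (s[0] < 0xC2 || s[0] > 0xDF)
		return false;

	/* Continuation byte must be 10xxxxxx */
	return (s[1] & 0xC0) == 0x80;
}
```

With this check, the pair 0xde 0x75 is rejected because 0x75 is not a continuation byte, whereas 0xc5 0x9e (the UTF-8 encoding of "Ş", U+015E) is accepted.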

> The problem seems to be in the code touched in the attached patch.

It looks basically correct, but cache_single_time does an extra
strdup when pg_any_to_server has already converted (and thus
allocated) the string. Maybe it would be better like this:

>     oldcxt = MemoryContextSwitchTo(TopMemoryContext);
>     ptr = pg_any_to_server(buf, strlen(buf), encoding);
>
>     if (ptr == buf)
>     {
>         /* Conversion didn't pstrdup, so we must */
>         ptr = pstrdup(buf);
>     }
>     MemoryContextSwitchTo(oldcxt);
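
The same "copy only if the converter did not allocate" idiom can be sketched outside the backend in plain C. This is only an illustration under stated assumptions: fake_convert and cache_copy are hypothetical names, malloc stands in for allocation in a long-lived memory context, and fake_convert merely mimics the contract the snippet above relies on (return the input pointer when nothing needs converting, otherwise return a new allocation).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a converter: returns src itself when no
 * conversion is needed, or a freshly allocated string otherwise.
 * (The "conversion" here is just a copy; only the contract matters.)
 */
static char *
fake_convert(char *src, int needs_conversion)
{
	size_t		len;
	char	   *copy;

	if (!needs_conversion)
		return src;

	len = strlen(src);
	copy = malloc(len + 1);
	if (copy != NULL)
		memcpy(copy, src, len + 1);
	return copy;
}

/*
 * Return a string the caller owns and that is distinct from the scratch
 * buffer, allocating exactly once in either case.  Returns NULL if the
 * allocation fails.
 */
static char *
cache_copy(char *buf, int needs_conversion)
{
	char	   *ptr;

	assert(buf != NULL);

	ptr = fake_convert(buf, needs_conversion);
	if (ptr == buf)
	{
		/* Converter did not allocate, so we must */
		size_t		len = strlen(buf);

		ptr = malloc(len + 1);
		if (ptr != NULL)
			memcpy(ptr, buf, len + 1);
	}
	return ptr;
}
```

Either way the caller ends up with exactly one allocation per cached string, which is the point of comparing ptr with buf before duplicating.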

- int i;
+ int i,
+ encoding;

This is not strictly followed, but (I believe) we don't declare
multiple variables in a single declaration.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
