RE: Allow escape in application_name

From: "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Kyotaro Horiguchi' <horikyota(dot)ntt(at)gmail(dot)com>, "masao(dot)fujii(at)oss(dot)nttdata(dot)com" <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "ikedamsh(at)oss(dot)nttdata(dot)com" <ikedamsh(at)oss(dot)nttdata(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "tgl(at)sss(dot)pgh(dot)pa(dot)us" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: RE: Allow escape in application_name
Date: 2021-10-13 11:05:19
Message-ID: TYAPR01MB5866A15E2ED07C7AC00BD77CF5B79@TYAPR01MB5866.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Dear Horiguchi-san, Fujii-san,

Great work. Thank you for replying and for the analysis!

> A. "^-?[0-9]+.*" : returns valid padding. p goes after the last digit.
> B. "^[^0-9-].*" : padding = 0, p doesn't advance.
> C. "^-[^0-9].*" : padding = 0, p advances by 1 byte.
> D. "^-" : padding = 0, p advances by 1 byte.
> (if *p == 0 then breaks)

I checked them, and your patterns are correct.

> If we want to make behaviors C and D the same as the current ones, the
> else clause should be as follows, but I don't think we need to
> do that.
> else
> {
>     padding = 0;
>     if (*p == '-')
>         p++;
> }

This handling is not complex, so I would like to add it if possible.
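
To make the four cases concrete, here is a minimal sketch of a padding parser that includes the else clause quoted above. It is not the code from the patch: the function name parse_padding_sketch and its signature are made up for illustration, and it only assumes the isdigit()/strtol() combination discussed in this thread.

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from the patch): parse an optional padding
 * spec such as "10" or "-5" starting at p, store the value in *padding,
 * and return the position where parsing stopped.
 */
static const char *
parse_padding_sketch(const char *p, int *padding)
{
    /* The first digit is expected right after an optional '-'. */
    const char *digits = (*p == '-') ? p + 1 : p;

    if (isdigit((unsigned char) *digits))
    {
        char       *endptr;

        /* Case A: a valid number; p moves past the last digit. */
        *padding = (int) strtol(p, &endptr, 10);
        p = endptr;
    }
    else
    {
        /* Cases B, C and D: no number follows. */
        *padding = 0;
        if (*p == '-')
            p++;            /* cases C and D: skip the lone '-' */
    }
    return p;
}
```

With this sketch, "-10p" yields padding = -10 and stops at 'p' (case A), "p" yields 0 without advancing (case B), and both "-p" and "-" yield 0 after advancing by one byte (cases C and D). Checking for the end of the string (*p == 0) is left to the caller.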

> One possible cause of a difference in behavior is character class
> handling including multibyte characters of isdigit and strtol. If
> isdigit accepts '一' as a digit (some platforms might do this), and
> strtol doesn't (I believe it is universal behavior), '%一0p' is
> converted to '%' and the pointer moves onto '一'. But I don't think we
> need to do something for such a crazy specification.

Does isdigit() handle multi-byte characters correctly? Its argument
is effectively a single unsigned char, that is, one byte.
Hence I thought that it cannot recognize '一' as a whole character.
Actually, I was concerned about a different thing. Maybe isdigit() just checks
whether the value of its argument is between 48 ('0') and 57 ('9'), which means that
part of some multi-byte characters may be accepted as a digit in some locales.
But of course I agree that this is a crazy case.
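
To illustrate the byte-wise view, here is a small sketch that counts how many bytes of a string isdigit() accepts. The helper name count_digit_bytes is made up for this mail, and the results I mention assume the default "C" locale. Other locales or platforms may behave differently.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: count the bytes of s that isdigit() accepts.
 * It looks at one byte at a time and knows nothing about characters.
 */
static size_t
count_digit_bytes(const char *s)
{
    size_t      n = 0;

    for (; *s != '\0'; s++)
    {
        /* Cast to unsigned char: a negative char value is undefined here. */
        if (isdigit((unsigned char) *s))
            n++;
    }
    return n;
}
```

In UTF-8, '一' is the three bytes 0xE4 0xB8 0x80, all outside 48..57, so count_digit_bytes("\xE4\xB8\x80") is 0 here. On the other hand, as far as I know a four-byte GB18030 sequence such as 0x81 0x30 0x81 0x30 has its second and fourth bytes inside the ASCII digit range, so the same helper counts 2 for it. That is the kind of partial match I had in mind.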

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
