RE: [PGdocs] fix description for handling pf non-ASCII characters

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'jian he' <jian(dot)universality(at)gmail(dot)com>
Cc: Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: [PGdocs] fix description for handling pf non-ASCII characters
Date: 2023-06-29 07:51:49
Message-ID: TYAPR01MB58662BC412E1290FC3348973F525A@TYAPR01MB5866.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Dear Jian,

Thank you for checking my patch!

>
> in your patch:
> > printable ASCII characters will be replaced with a hex escape.
>
> My wording is not good. I think the result will be: ASCII characters
> will be as is, non-ASCII characters will be replaced with "a hex
> escape".

Yes, your point is right. My patch already says:
"anything other than printable ASCII characters will be replaced with a hex escape"
IIUC, the two descriptions have the same meaning.

Perhaps you meant that the sentence was poorly worded, so I have reworded it as
"non-ASCII characters will be replaced with hexadecimal strings." What do you think?

> set application_name to 'abc漢字Abc';
> SET
> test16=# show application_name;
> application_name
> --------------------------------
> abc\xe6\xbc\xa2\xe5\xad\x97Abc
> (1 row)
>
> I see multi escape, so I am not sure "a hex escape".

I'm not sure what you mean, but I could not find the term "hex escape" in the documentation,
so I used "hexadecimal string" instead. Is that acceptable?

> to properly render it back to 'abc漢字Abc'
> here is how i do it:
> select 'abc' || convert_from(decode(' e6bca2e5ad97','hex'), 'UTF8') || 'Abc';

Yes, your approach seems right, but I'm not sure it is related to this patch.
Just to confirm, I'm not interested in a method for rendering non-ASCII characters.
My motivation for the patch was to document the incompatibility noted in [1]:

>
Changed the conversion rules when non-ASCII characters are specified for ASCII-only
strings such as parameters application_name and cluster_name. Previously, it was
converted in byte units with a question mark (?), but in PostgreSQL 16, it is
converted to a hexadecimal string.
>
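
To illustrate the difference described above, here is a minimal Python sketch.
It is not the server code: it assumes the value is UTF-8 encoded and treats
bytes 0x20 to 0x7E as printable ASCII, and both function names are hypothetical.

```python
def clean_ascii_old(value: str, encoding: str = "utf-8") -> str:
    """Sketch of the previous rule: each byte outside printable ASCII becomes '?'."""
    raw = value.encode(encoding)
    # Assumption: printable ASCII means bytes 0x20 (space) through 0x7E (~).
    return "".join(chr(b) if 32 <= b <= 126 else "?" for b in raw)


def clean_ascii_pg16(value: str, encoding: str = "utf-8") -> str:
    """Sketch of the PostgreSQL 16 rule: each such byte becomes a \\xNN string."""
    raw = value.encode(encoding)
    # One escape is emitted per byte, so a multibyte character yields several.
    return "".join(chr(b) if 32 <= b <= 126 else f"\\x{b:02x}" for b in raw)


if __name__ == "__main__":
    name = "abc漢字Abc"
    print(clean_ascii_old(name))   # abc??????Abc
    print(clean_ascii_pg16(name))  # abc\xe6\xbc\xa2\xe5\xad\x97Abc
```

With the value from your example, the second function gives the same string
that "show application_name" printed, one hexadecimal string per byte.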

> I guess it's still painful if your application_name has non-ASCII chars.

I agree, but no one has recommended using non-ASCII characters there.

[1]: https://h50146.www5.hpe.com/products/software/oe/linux/mainstream/support/lcc/pdf/PostgreSQL16Beta1_New_Features_en_20230528_1.pdf

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment: v2_doc_fix.patch (application/octet-stream, 2.5 KB)
