Re: C11: should we use char32_t for unicode code points?

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>
Cc: Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: C11: should we use char32_t for unicode code points?
Date: 2025-10-28 21:54:35
Message-ID: 5d5741649befc2c8a76ff3e8fc950ef50542e1ca.camel@j-davis.com
Lists: pgsql-hackers

On Wed, 2025-10-29 at 09:03 +1300, Thomas Munro wrote:
> If you try to test for the existence of the type rather than the
> header in meson/configure, won't you still have the configure-with-C
> compile-with-C++ problem

I must have misunderstood the first time. If we depend on
HAVE_CHAR32_T, then it will be set in stone in pg_config.h, and if C++
tries to include the file then it will try the typedef again and fail.

I tried with headerscheck --cplusplus before posting it, but because my
machine has uchar.h, the check didn't fail.

I went back to using the check for __cplusplus, and added a comment
that hopefully clarifies things.
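
To illustrate the idea, here is a minimal sketch of that kind of guard. It
is not the text of the patch: HAVE_UCHAR_H is assumed to be a macro that
configure/meson defines when <uchar.h> exists, and is_unicode_scalar() is a
hypothetical helper added only to show the type in use. The point is that
__cplusplus is evaluated by whichever compiler is actually including the
header, while a HAVE_CHAR32_T result would be fixed at configure time by
the C compiler.

```c
#include <stdint.h>

/*
 * Sketch only, not the patch text.  HAVE_UCHAR_H is an assumed macro set
 * by configure/meson when <uchar.h> is available.
 */
#ifdef __cplusplus
/* C++11 and later: char16_t and char32_t are keywords; never typedef them. */
#elif defined(HAVE_UCHAR_H)
#include <uchar.h>
#else
/* C without <uchar.h>: use the same underlying types that C11 specifies. */
typedef uint_least16_t char16_t;
typedef uint_least32_t char32_t;
#endif

/* Hypothetical helper: true if c is a Unicode scalar value. */
static inline int
is_unicode_scalar(char32_t c)
{
	/* In range, and not a UTF-16 surrogate (U+D800..U+DFFF). */
	return c <= 0x10FFFF && !(c >= 0xD800 && c <= 0xDFFF);
}
```

With this ordering, a C++ translation unit never sees the typedefs,
regardless of what the C compiler found at configure time.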

I also reordered the checks so that it prefers to include uchar.h if
available, even when using C++, because that seems like the cleaner end
goal. However, that caused another problem in CI (mingw_cross_warning),
apparently due to a conflict between uchar.h and win32_port.h on that
platform:

[21:48:21.794] ../../src/include/port/win32_port.h: At top level:
[21:48:21.794] ../../src/include/port/win32_port.h:254:8: error: redefinition of ‘struct stat’
[21:48:21.794]   254 | struct stat /* This should match struct __stat64 */
[21:48:21.794]       | ^~~~
[21:48:21.794] In file included from /usr/share/mingw-w64/include/wchar.h:413,
[21:48:21.794]                  from /usr/share/mingw-w64/include/uchar.h:28,
[21:48:21.794]                  from ../../src/include/c.h:526:
[21:48:21.794] /usr/share/mingw-w64/include/_mingw_stat64.h:40:10: note: originally defined here
[21:48:21.794]    40 | struct stat {
[21:48:21.794]       | ^~~~

https://cirrus-ci.com/task/4849300577976320

I could reverse the checks again and I think it will work, but let me
know if you have an idea for a better fix.

I never thought it would be so much trouble just to get a suitable type
for a UTF-32 code point...

Regards,
Jeff Davis

Attachment: v4-0001-Use-C11-char16_t-and-char32_t-for-Unicode-code-po.patch (text/x-patch, 57.6 KB)
