C11: should we use char32_t for unicode code points?

From:	Jeff Davis <pgsql(at)j-davis(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	C11: should we use char32_t for unicode code points?
Date:	2025-10-23 18:15:54
Message-ID:	bedcc93d06203dfd89815b10f815ca2de8626e85.camel@j-davis.com
Lists:	pgsql-hackers

Now that we're using C11, should we use char32_t for unicode code
points?

Right now, we use pg_wchar for two purposes:

1. to abstract away some problems with wchar_t on platforms where
it's 16 bits; and
2. to hold Unicode code point values.

In UTF8, they are equivalent and can be freely cast back and forth,
but not necessarily in other encodings. That can be confusing in some
contexts. Attached is a patch to use char32_t for the second purpose.

Both are equivalent to uint32, so there's no functional change and no
actual typechecking; it's just for readability.
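
To illustrate the "no actual typechecking" point, here is a minimal sketch, not
taken from the patch. It assumes pg_wchar is a plain 32-bit unsigned integer
(pg_wchar_sketch below is a stand-in typedef, not the real definition), and
both helper functions are hypothetical names invented for this example.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdbool.h>
#include <stdint.h>
#include <uchar.h>    /* char32_t (C11) */

/* Assumption: stand-in for pg_wchar, taken here to be a 32-bit unsigned int. */
typedef uint32_t pg_wchar_sketch;

/* On platforms where this holds, the two types have the same representation. */
static_assert(sizeof(char32_t) == sizeof(pg_wchar_sketch),
              "char32_t and the pg_wchar stand-in differ in size");

/*
 * Hypothetical helper: true if cp is a Unicode scalar value, i.e. at most
 * U+10FFFF and not in the surrogate range U+D800..U+DFFF.
 */
static bool
codepoint_is_valid(char32_t cp)
{
    return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}

/*
 * Hypothetical helper: in UTF8 the wide-character value is the code point
 * itself, so the conversion is a value-preserving cast. In other encodings
 * this would not be a valid way to obtain a code point.
 */
static char32_t
utf8_wchar_to_codepoint(pg_wchar_sketch wc)
{
    return (char32_t) wc;
}
```

Because both types are unsigned integer types, the compiler accepts a
pg_wchar_sketch wherever a char32_t is expected (for example
codepoint_is_valid(wc) with no cast), so the distinction only documents intent.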

Is this helpful, or needless code churn?

Regards,
Jeff Davis

Attachment	Content-Type	Size
v1-0001-Use-C11-char32_t-for-Unicode-code-points.patch	text/x-patch	50.0 KB

Responses

Re: C11: should we use char32_t for unicode code points? at 2025-10-24 09:43:15 from Tatsuo Ishii
