Re: C11: should we use char32_t for unicode code points?

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: Jeff Davis <pgsql(at)j-davis(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: C11: should we use char32_t for unicode code points?
Date: 2025-10-28 18:45:18
Message-ID: 044d5476-96e9-4537-92b9-fbceb7960be5@eisentraut.org
Lists: pgsql-hackers

This patch looks good to me overall; it's a nice improvement in clarity.

On 26.10.25 20:43, Jeff Davis wrote:
> +/*
> + * char16_t and char32_t
> + * Unicode code points.
> + */
> +#ifndef __cplusplus
> +#ifdef HAVE_UCHAR_H
> +#include <uchar.h>
> +#ifndef __STDC_UTF_16__
> +#error "char16_t must use UTF-16 encoding"
> +#endif
> +#ifndef __STDC_UTF_32__
> +#error "char32_t must use UTF-32 encoding"
> +#endif
> +#else
> +typedef uint16_t char16_t;
> +typedef uint32_t char32_t;
> +#endif
> +#endif

This could be improved a bit. The reason for some of these conditionals
is not clear. For example, what does __cplusplus have to do with this? I
think it would be more correct to write a configure/meson check for the
actual types rather than depend indirectly on a header check.
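
A minimal sketch of what keying the fallback on the types themselves
could look like. HAVE_CHAR16_T and HAVE_CHAR32_T are assumed names for
symbols that a configure/meson type probe would define; they are not in
the patch, and example_is_valid_code_point() is a hypothetical helper
that only shows the resulting type in use.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * HAVE_CHAR16_T and HAVE_CHAR32_T are assumptions: symbols that a build-time
 * probe would define if a declaration of each type compiles with <uchar.h>
 * included.  The fallback is keyed on the types, not on the header.
 */
#if defined(HAVE_CHAR16_T) && defined(HAVE_CHAR32_T)
#include <uchar.h>
#else
typedef uint16_t char16_t;
typedef uint32_t char32_t;
#endif

/*
 * Hypothetical helper: true if c is a Unicode scalar value, that is, within
 * U+0000..U+10FFFF and not a surrogate.  Works with either definition of
 * char32_t above because it only relies on integer comparisons.
 */
static bool
example_is_valid_code_point(char32_t c)
{
    if (c > 0x10FFFF)
        return false;
    if (c >= 0xD800 && c <= 0xDFFF)
        return false;
    return true;
}
```

With this shape, a platform that has the header but lacks one of the
types would still get the typedef fallback.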

The checks for __STDC_UTF_16__ and __STDC_UTF_32__ can be removed, as
was discussed elsewhere, since we don't use any standard library
functions that make use of these facts, and the need goes away with C23
anyway.
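
To illustrate the point about not depending on the standard library,
here is a sketch of code that treats a char32_t purely as an integer
code point. example_code_point_to_utf8() is a hypothetical helper, not
something from the patch or the tree; it never calls mbrtoc32() or
c32rtomb(), so the encoding those functions assume does not matter to it.

```c
#include <stddef.h>
#include <uchar.h>

/*
 * Hypothetical helper: encode one code point as UTF-8 into out, which must
 * have room for 4 bytes.  Returns the number of bytes written (1 to 4), or
 * 0 if c is a surrogate or lies beyond U+10FFFF.  Only integer arithmetic
 * is used; no <uchar.h> conversion function is called.
 */
static size_t
example_code_point_to_utf8(char32_t c, unsigned char *out)
{
    if (c <= 0x7F)
    {
        out[0] = (unsigned char) c;
        return 1;
    }
    if (c <= 0x7FF)
    {
        out[0] = (unsigned char) (0xC0 | (c >> 6));
        out[1] = (unsigned char) (0x80 | (c & 0x3F));
        return 2;
    }
    if (c >= 0xD800 && c <= 0xDFFF)
        return 0;               /* surrogates are not scalar values */
    if (c <= 0xFFFF)
    {
        out[0] = (unsigned char) (0xE0 | (c >> 12));
        out[1] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
        out[2] = (unsigned char) (0x80 | (c & 0x3F));
        return 3;
    }
    if (c <= 0x10FFFF)
    {
        out[0] = (unsigned char) (0xF0 | (c >> 18));
        out[1] = (unsigned char) (0x80 | ((c >> 12) & 0x3F));
        out[2] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
        out[3] = (unsigned char) (0x80 | (c & 0x3F));
        return 4;
    }
    return 0;                   /* beyond the Unicode range */
}
```

Code of this kind behaves the same whether or not __STDC_UTF_32__ is
defined, since the value is never handed to the C library.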
