Re: Change initdb default to the builtin collation provider

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Change initdb default to the builtin collation provider
Date: 2025-10-31 21:30:19
Message-ID: 47e1b4f72fe732c5ae85c6cf2c085b4e99a10120.camel@j-davis.com
Lists: pgsql-hackers

On Fri, 2025-10-10 at 17:48 -0700, Jeff Davis wrote:
> -------
> Summary
> -------
>
> The libc collation provider is a bad default[1]. The builtin collation
> provider is a good default, so let's use that.

The attached patches implement a more modest proposal which does not
conflict with Peter's objection about the display order:

0001: If the encoding is unspecified, and cannot be determined from the
locale (i.e. the locale is C), then use UTF-8 rather than SQL_ASCII.

0002: If the provider is unspecified, and the locale is C or C.UTF-8,
then use the builtin provider.
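
As a rough illustration of the two rules above (not the patch code itself,
which lives in initdb's C source), the selection logic could be sketched as
follows. The function name choose_defaults, its parameters, and the
locale_encoding argument (standing in for "the encoding determined from the
locale", or None when it cannot be determined) are all hypothetical names
chosen for this sketch. The fallback to libc for other locales reflects the
existing default rather than anything the patches change.

```python
from typing import Optional, Tuple


def choose_defaults(locale: str,
                    encoding: Optional[str] = None,
                    provider: Optional[str] = None,
                    locale_encoding: Optional[str] = None) -> Tuple[str, str]:
    """Return (encoding, provider) under the rules proposed in 0001 and 0002.

    Hypothetical sketch: `encoding` and `provider` are the values the user
    passed explicitly (None if unspecified); `locale_encoding` is the encoding
    implied by the locale, or None if it cannot be determined (locale "C").
    """
    # 0001: encoding unspecified and not derivable from the locale
    # -> use UTF-8 rather than SQL_ASCII.
    if encoding is None:
        encoding = locale_encoding if locale_encoding is not None else "UTF8"

    # 0002: provider unspecified and locale is C or C.UTF-8
    # -> use the builtin provider; otherwise keep the libc default.
    if provider is None:
        provider = "builtin" if locale in ("C", "C.UTF-8") else "libc"

    return encoding, provider
```

Explicitly specified values always win in this sketch; only the unspecified
cases change.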

Motivation:

* UTF-8 seems safer than SQL_ASCII when the locale is compatible with
either.

* Whether the "C" locale uses the builtin provider or the libc provider
is mostly about the catalog representation, because the implementation
is the same. I don't have a strong motivation for this change; it just
clarifies that libc is not actually being used when the locale is "C".

* I think most users of the "C.UTF-8" locale would be better off with
the builtin provider, which benefits from important optimizations.

Note:

This would mean that "initdb --no-locale" would select UTF-8 and the
builtin provider with locale "C", whereas previously it would have
selected SQL_ASCII and the libc provider (though it didn't ever really
use libc internally). I'm not sure if others want this behavior or if
it would be surprising.

Regards,
Jeff Davis

Attachments:
  v1-0001-initdb-prefer-UTF-8-encoding-over-SQL_ASCII.patch (text/x-patch, 1.0 KB)
  v1-0002-initdb-if-locale-is-C-or-C.UTF-8-use-builtin-prov.patch (text/x-patch, 2.4 KB)
