Re: Change initdb default to the builtin collation provider

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Change initdb default to the builtin collation provider
Date: 2026-03-10 15:12:16
Message-ID: CA+TgmobnOn9ipEUcthLmxvv0ZKBWyCcc048qqgqy1X+tszw_Cg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Oct 31, 2025 at 5:30 PM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> The attached patches implement a more modest proposal which does not
> conflict with Peter's objection about the display order:
>
> 0001: If the encoding is unspecified, and cannot be determined from the
> locale (i.e. the locale is C), then use UTF-8 rather than SQL_ASCII.

I don't know if this is exactly the right proposal, but I think it's
probably appropriate to start gently pushing people towards UTF-8
rather than anything else. Unicode has largely won, AFAICT, and the
use cases for anything else are increasingly narrow. I don't think we
should try to be coercive, but there's a reasonable presumption that
people who haven't said what they want probably want UTF-8.
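
To make the rule in 0001 concrete, here is a toy model of the decision as
the quoted text describes it. It is only a sketch and not initdb's actual
code: the function name and both parameters are hypothetical, and
`locale_encoding=None` stands in for "the locale is C, so no encoding can
be determined from it".

```python
from typing import Optional


def default_encoding(requested: Optional[str],
                     locale_encoding: Optional[str]) -> str:
    """Pick an encoding following the rule described for patch 0001.

    requested:       encoding the user asked for explicitly, or None.
    locale_encoding: encoding implied by the locale, or None when the
                     locale (e.g. C) does not imply one. Hypothetical input.
    """
    if requested is not None:
        # An explicit choice always wins; nobody is coerced.
        return requested
    if locale_encoding is not None:
        # The locale tells us what the user most likely wants.
        return locale_encoding
    # Nothing specified and nothing to infer: proposed default is UTF-8,
    # where the behavior being replaced would be SQL_ASCII.
    return "UTF8"


print(default_encoding(None, None))
print(default_encoding("LATIN1", None))
```

Only the last branch changes under the proposal; anyone who states an
encoding, or whose locale implies one, gets the same result as before.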

> 0002: If the provider is unspecified, and the locale is C or C.UTF-8,
> then use the builtin provider.

I'm much less convinced about this idea. I think quite a lot of people
will be unhappy about the change to a less user-friendly sort order.
It's reasonable to want something more stable and better
version-controlled than libc, but switching to a simple code-point sort
seems like a high price to pay for that.
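
As a small illustration of what a code-point sort does to mixed-case and
accented text, the sketch below uses Python's default string comparison,
which orders by code point. The "friendlier" ordering is only a rough
stand-in built from case folding and accent stripping; it is an assumption
for illustration and does not reproduce what libc or ICU would return for
any particular locale.

```python
import unicodedata

words = ["apple", "Banana", "cherry", "Zebra", "éclair"]


def rough_linguistic_key(s: str) -> str:
    """Approximate a human-friendly key: ignore case and strip accents.

    This is a hypothetical helper for illustration only, not a real
    collation algorithm.
    """
    decomposed = unicodedata.normalize("NFKD", s)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.casefold()


# Plain comparison: uppercase ASCII sorts before lowercase, and
# accented letters sort after all of ASCII.
by_code_point = sorted(words)

# Rough approximation of the ordering many users expect.
by_rough_key = sorted(words, key=rough_linguistic_key)

print(by_code_point)  # ['Banana', 'Zebra', 'apple', 'cherry', 'éclair']
print(by_rough_key)   # ['apple', 'Banana', 'cherry', 'éclair', 'Zebra']
```

The code-point result is stable across library versions, which is the
appeal; the cost is that "Zebra" lands ahead of "apple".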

--
Robert Haas
EDB: http://www.enterprisedb.com
