Re: Change initdb default to the builtin collation provider

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Change initdb default to the builtin collation provider
Date: 2026-03-11 20:05:35
Message-ID: 138f54d51a4baffa3e4f80d68d814e7bf3d99716.camel@cybertec.at
Lists: pgsql-hackers

On Wed, 2026-03-11 at 08:47 -0400, Robert Haas wrote:
> My experience is that when I tell people they can use
> collate "C" to speed up sorting, they tell me that's a stupid
> workaround that doesn't give them the answers that they want, which
> obviously colors my viewpoint on this question in the same way that
> your experiences color yours.

That makes sense - I would be surprised if everybody were happy with
the C collation's sort order. On the other hand, I have had lots of
reports about corrupted indexes that need rebuilding (only today one
person in my course mentioned it), and I find that people don't exactly
appreciate the prospect of having to rebuild dozens of indexes after
an upgrade, when they want to keep the downtime short.

My vision of a better future is like this: PostgreSQL defaults to the
C collation. People will express unhappiness about the way names
get sorted. "Easy", we tell them, "change that column's collation to
a natural language collation". They do it and are happy.

The big advantage: if you have only two or three indexes in your
database that are sorted in a collation other than C, the likelihood
of index corruption will be way lower. For example, the unique
constraint on your part number column that contains values like
'XY-1-13*' or '*P1-12_A' (which are pretty likely to be affected by
the subtle changes in libc collations) will be sorted in the C
collation, which is just fine for everybody.
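
To illustrate why such values are safe under the C collation, here is a
rough sketch (plain Python, not PostgreSQL code). It assumes UTF-8 text and
mimics the C collation by comparing the encoded bytes. The helper name
c_collation_key and the two extra sample values 'XY-1-2' and 'xy-1-13' are
made up for the example.

```python
# Sketch only: the C collation compares strings byte by byte, so the
# resulting order does not depend on libc or ICU locale data.
# c_collation_key is a hypothetical helper, not a PostgreSQL API.

def c_collation_key(value: str) -> bytes:
    """Return a sort key that mimics byte-wise comparison of UTF-8 text."""
    return value.encode("utf-8")


# The first two values come from the example above; the last two are
# invented to show how digits and letter case are ordered.
part_numbers = ["XY-1-13*", "*P1-12_A", "XY-1-2", "xy-1-13"]

sorted_parts = sorted(part_numbers, key=c_collation_key)
print(sorted_parts)
```

Because the order is derived from the bytes alone, an operating system
upgrade that changes the locale definitions cannot change it, and an index
built on that order stays valid.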

This approach to collations seems to work well for Oracle users,
so why not for us?

Yours,
Laurenz Albe
