Re: Add CASEFOLD() function.

From: Robert Treat <rob(at)xzilla(dot)net>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Thom Brown <thom(at)linux(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Vik Fearing <vik(at)postgresfriends(dot)org>, Joe Conway <mail(at)joeconway(dot)com>, Ian Lawrence Barwick <barwick(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add CASEFOLD() function.
Date: 2025-06-19 16:51:05
Message-ID: CABV9wwOQQ8y_+cdaH9awN1_gcoHYbncznQgPoLpbm5k+AtLR3w@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 19, 2025 at 12:33 PM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
>
> On Thu, 2025-06-19 at 16:36 +0100, Thom Brown wrote:
> > Ease of use, perhaps. It seems easier to use:
> >
> > column_name cftext
> >
> > rather than:
> >
> > CREATE COLLATION case_insensitive_collation (
> > PROVIDER = icu,
> > LOCALE = 'und-u-ks-level2',
> > DETERMINISTIC = FALSE
> > );
>
> We could auto-create such a collation at initdb time for ICU-enabled
> builds.
>

Providing a generic insensitive/non-deterministic collation by default
would solve a number of different use cases, so +1 on the idea from me.
And TBH I usually build --without-icu, but this would likely cause me
to change that.

> > But I see the arguments against it. It creates an unnecessary
> > dependency on an extension, and if someone wants to ignore both case
> > and accents, they may resort to using 2 extensions (citext + unaccent)
> > when none are needed.
>
> There are at least three ways to do case insensitivity (or other kinds
> of equivalence):
>
> * Explicit function calls in queries, as well as index and constraint
> definitions. E.g. expression index on LOWER(), queries that explicitly
> do "LOWER(x) = ..."
>
> * Wrap those function calls up in a separate data type, like citext.
>
> * Non-deterministic collations.
>
> Given that we have collations, which are a way of organizing alternate
> behaviors for existing data types, I'm not sure I see the need for
> creating an entirely separate data type.
>
> > I guess I don't feel strongly about it either way.
>
> Are you a user of citext? I'm genuinely interested in the use cases,
> and whether the separate-data-type approach has merits that are missing
> in the other approaches.
>

Yeah, I'd be interested to hear if there is some missing bit that
existing users have concerns over. As a former user of citext, I found
it a great workaround at the time, but there are "better ways" to
handle those things now (imho).
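
For comparison, here is a rough sketch of the three approaches listed
above, side by side. The table, column, and index names are made up for
illustration. It assumes the citext contrib extension is available and,
for the last variant, an ICU-enabled build. The collation definition is
the one quoted earlier in the thread.

```sql
-- Approach 1: explicit function calls in the index and in every query.
-- (users_fn and its index name are hypothetical.)
CREATE TABLE users_fn (email text NOT NULL);
CREATE UNIQUE INDEX users_fn_email_lower_idx ON users_fn (lower(email));
SELECT * FROM users_fn WHERE lower(email) = lower('Alice@Example.com');

-- Approach 2: a separate data type that wraps the function calls.
-- Requires the citext extension to be installed.
CREATE EXTENSION IF NOT EXISTS citext;
CREATE TABLE users_ci (email citext NOT NULL UNIQUE);
SELECT * FROM users_ci WHERE email = 'Alice@Example.com';

-- Approach 3: a non-deterministic collation on an ordinary text column.
-- Requires a build with ICU support.
CREATE COLLATION case_insensitive_collation (
    PROVIDER = icu,
    LOCALE = 'und-u-ks-level2',
    DETERMINISTIC = FALSE
);
CREATE TABLE users_coll (
    email text COLLATE case_insensitive_collation NOT NULL UNIQUE
);
SELECT * FROM users_coll WHERE email = 'Alice@Example.com';
```

In the first variant every query has to repeat the lower() call to match
the index; in the other two the comparison rule travels with the column
definition, so a plain "=" is enough.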

Robert Treat
https://xzilla.net
