From: | Thom Brown <thom(at)linux(dot)com> |
---|---|
To: | Jeff Davis <pgsql(at)j-davis(dot)com> |
Cc: | Peter Eisentraut <peter(at)eisentraut(dot)org>, Vik Fearing <vik(at)postgresfriends(dot)org>, Joe Conway <mail(at)joeconway(dot)com>, Ian Lawrence Barwick <barwick(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Add CASEFOLD() function. |
Date: | 2025-06-19 16:59:08 |
Message-ID: | CAA-aLv5Se9zt3CxcGWYwJUA-0nnx+sAArwUWRJKTwTL1a=8YyA@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, 19 Jun 2025, 17:33 Jeff Davis, <pgsql(at)j-davis(dot)com> wrote:
> On Thu, 2025-06-19 at 16:36 +0100, Thom Brown wrote:
> > Ease of use, perhaps. It seems easier to use:
> >
> > column_name cftext
> >
> > rather than:
> >
> > CREATE COLLATION case_insensitive_collation (
> > PROVIDER = icu,
> > LOCALE = 'und-u-ks-level2',
> > DETERMINISTIC = FALSE
> > );
>
> We could auto-create such a collation at initdb time for ICU-enabled
> builds.
>
> > But I see the arguments against it. It creates an unnecessary
> > dependency on an extension, and if someone wants to ignore both case
> > and accents, they may resort to using 2 extensions (citext +
> > unaccent) when none are needed.
>
> There are at least three ways to do case insensitivity (or other kinds
> of equivalence):
>
> * Explicit function calls in queries, as well as index and constraint
> definitions. E.g. expression index on LOWER(), queries that explicitly
> do "LOWER(x) = ..."
>
> * Wrap those function calls up in a separate data type, like citext.
>
> * Non-deterministic collations.
>
> Given that we have collations, which are a way of organizing alternate
> behaviors for existing data types, I'm not sure I see the need for
> creating an entirely separate data type.
>
> > I guess I don't feel strongly about it either
> > way.
>
> Are you a user of citext? I'm genuinely interested in the use cases,
> and whether the separate-data-type approach has merits that are missing
> in the other approaches.
>
No. But given the options, I would personally choose nondeterministic
collations now that they are available. I just wish they were more
user-friendly, as I suspect the majority of people either won't know about
them or won't know how to use them. But like you say, maybe having a set
of predefined nondeterministic collations would help. As it stands, I'm
just bringing up the consideration of citext in case it has any value,
which it doesn't appear to. In fact, it's probably even an argument to
begin the process of deprecating citext.
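
For anyone following along, here is a rough sketch of how the three
approaches listed above might look side by side. The table and index names
(users_fn, users_ci, users_nd, users_fn_email_lower_idx) are made up for
illustration, and the third variant assumes an ICU-enabled build; the
collation definition is the one quoted earlier in this message.

```sql
-- 1. Explicit function calls: the index and every query must both
--    apply LOWER() for the comparison to be case-insensitive.
CREATE TABLE users_fn (email text NOT NULL);
CREATE UNIQUE INDEX users_fn_email_lower_idx ON users_fn (LOWER(email));
SELECT * FROM users_fn WHERE LOWER(email) = LOWER('Alice@Example.com');

-- 2. Separate data type: citext does the folding internally, at the
--    cost of depending on an extension.
CREATE EXTENSION IF NOT EXISTS citext;
CREATE TABLE users_ci (email citext PRIMARY KEY);
SELECT * FROM users_ci WHERE email = 'Alice@Example.com';

-- 3. Nondeterministic collation: plain text column, with the
--    equivalence rules carried by the collation (requires ICU).
CREATE COLLATION case_insensitive_collation (
    PROVIDER = icu,
    LOCALE = 'und-u-ks-level2',
    DETERMINISTIC = FALSE
);
CREATE TABLE users_nd (
    email text COLLATE case_insensitive_collation PRIMARY KEY
);
SELECT * FROM users_nd WHERE email = 'Alice@Example.com';
```

In variants 2 and 3 the query needs no function call, which is the
ease-of-use point; variant 3 gets there without an extension.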
Thom