Support LIKE with nondeterministic collations

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Support LIKE with nondeterministic collations
Date: 2024-04-29 06:45:26
Lists: pgsql-hackers

This patch adds support for using LIKE with nondeterministic collations.
So you can do things such as

col LIKE 'foo%' COLLATE case_insensitive

This currently results in a "not supported" error. When I first
developed support for nondeterministic collations, I didn't know what
the semantics of LIKE should be, especially since with nondeterministic
collations strings of different lengths can be equal, so I dropped the
issue for a while.

After further research, I found that the SQL standard's definition of
the LIKE predicate actually specifies the semantics clearly: The pattern
is partitioned into substrings at wildcard characters (so 'foo%bar' is
partitioned into 'foo', '%', 'bar'), and the whole predicate matches if
a match can be found for each partition under the applicable collation
(so for 'foo%bar' we look to partition the input string into
s1 || s2 || s3 such that s1 = 'foo', s2 is anything, and s3 = 'bar').
The only difference from deterministic collations is that for
deterministic collations we can optimize this by matching character by
character, but for nondeterministic collations we have to match by
substring.
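
To illustrate the substring-based semantics described above, here is a
rough sketch in Python. It is not the patch's C implementation. The
names collation_equal, split_pattern and like are made up for this
example, and the "collation" is only a stand-in (case folding plus
Unicode normalization) chosen so that strings of different lengths can
compare equal. Escape characters are ignored, and '_' is assumed to
match exactly one code point, which is a simplification.

```python
import unicodedata


def collation_equal(a: str, b: str) -> bool:
    # Stand-in for a nondeterministic collation (an assumption for this
    # sketch): compare after case folding and NFD normalization, so a
    # precomposed "é" equals "e" followed by a combining acute accent.
    def key(s: str) -> str:
        return unicodedata.normalize("NFD", s.casefold())
    return key(a) == key(b)


def split_pattern(pattern: str) -> list[str]:
    # Partition the pattern at wildcard characters:
    # 'foo%bar' -> ['foo', '%', 'bar'].  Escapes are not handled.
    parts: list[str] = []
    literal = ""
    for ch in pattern:
        if ch in "%_":
            if literal:
                parts.append(literal)
                literal = ""
            parts.append(ch)
        else:
            literal += ch
    if literal:
        parts.append(literal)
    return parts


def like(text: str, pattern: str) -> bool:
    parts = split_pattern(pattern)

    def match(pi: int, ti: int) -> bool:
        # All partitions consumed: succeed only if the input is too.
        if pi == len(parts):
            return ti == len(text)
        part = parts[pi]
        if part == "%":
            # '%' may absorb any substring, including the empty one.
            return any(match(pi + 1, tj) for tj in range(ti, len(text) + 1))
        if part == "_":
            # Simplification: one code point per '_'.
            return ti < len(text) and match(pi + 1, ti + 1)
        # Literal partition: try every candidate substring, because an
        # equal substring need not have the same length as the literal.
        return any(
            collation_equal(text[ti:tj], part) and match(pi + 1, tj)
            for tj in range(ti, len(text) + 1)
        )

    return match(0, 0)


print(like("FOOxyzBAR", "foo%bar"))
print(like("Cafe\u0301 au lait", "caf\u00e9%"))
```

In the second call the literal partition has four code points while the
matching prefix of the input has five, which is the case where matching
character by character would not work. The backtracking here is
deliberately naive and can be exponential in the worst case.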

Attachment Content-Type Size
v1-0001-Support-LIKE-with-nondeterministic-collations.patch text/plain 17.5 KB

