Re: Wrong results with equality search using trigram index and non-deterministic collation

From: Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: David Geier <geidav(dot)pg(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Wrong results with equality search using trigram index and non-deterministic collation
Date: 2026-05-07 19:00:04
Message-ID: CAN4CZFNTeOU-omwNvgVrgGZhe1MTFXQXOv-28S4-GEjO5ytV7w@mail.gmail.com
Lists: pgsql-hackers

Hello

> Does that mean that you could end up with wrong results (which would not
> be acceptable), or that you could end up with false positives that
> later get eliminated by the recheck (which would be fine)?

+ /*
+ * For non-C collations, extract the three bytes from each trigram
+ * and compare them using the collation's comparison function.
+ */

...

+ /* Use collation-aware comparison */
+ result = pg_strncoll(str_a, 3, str_b, 3, locale);
+ PG_RETURN_INT32(result);

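For non-C collations, isn't the trigram likely to be a hash rather than
a proper string? As far as I can tell, pg_trgm hashes any trigram that
contains multibyte characters down to three bytes, and pg_strncoll
won't work properly on those bytes.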
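
To illustrate the concern, here is a minimal standalone sketch of how I
understand the 3-byte trigram slot to be filled. It is not the pg_trgm
source: sketch_hash and sketch_compact_trigram are hypothetical names,
and the FNV-1a hash is only a stand-in for the CRC that pg_trgm uses.
The point is just that once a trigram is longer than three bytes, the
slot holds hash output, and comparing it as text is meaningless.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in hash (FNV-1a); pg_trgm itself uses a CRC, not this. */
static uint32_t
sketch_hash(const char *str, size_t len)
{
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++)
    {
        h ^= (unsigned char) str[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Store one trigram (three characters, bytelen bytes) in a 3-byte slot.
 * Returns true when the slot holds hash bytes rather than the text itself.
 */
static bool
sketch_compact_trigram(unsigned char out[3], const char *str, size_t bytelen)
{
    assert(bytelen >= 3);

    if (bytelen == 3)
    {
        /* Three single-byte characters: the slot is the text, byte for byte. */
        memcpy(out, str, 3);
        return false;
    }

    /* At least one multibyte character: keep three bytes of a hash instead. */
    uint32_t h = sketch_hash(str, bytelen);

    out[0] = (unsigned char) (h >> 24);
    out[1] = (unsigned char) (h >> 16);
    out[2] = (unsigned char) (h >> 8);
    return true;
}
```

With this sketch, "abc" is stored verbatim, while a UTF-8 "äbc" (4
bytes) takes the hashed path, so the three stored bytes are no longer a
valid string to hand to a collation-aware comparison.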
