Nondeterministic collations vs. text_pattern_ops

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Nondeterministic collations vs. text_pattern_ops
Date: 2019-09-16 23:13:39
Lists: pgsql-hackers

Whilst poking at the leakproofness-of-texteq issue, I realized
that there's an independent problem caused by the nondeterministic
collations patch. To wit, that the text_pattern_ops btree opclass uses
texteq as its equality operator, even though that operator is
no longer guaranteed to be bitwise equality. That means that
depending on which collation happens to get attached to the
operator, equality might be inconsistent with the other members
of the opclass, leading to who-knows-what bad results.
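
The mismatch described above can be illustrated with a toy model. This is only a sketch, not PostgreSQL code: the function names are hypothetical, and Python's casefold() merely stands in for a nondeterministic (here, case-insensitive) collation. It checks the property a btree opclass relies on, namely that the equality operator returns true exactly when the comparison function returns 0.

```python
# Toy model only; none of these names exist in PostgreSQL.

def bytewise_cmp(a: str, b: str) -> int:
    """Stand-in for the pattern_ops comparison: order by raw bytes,
    ignoring whatever collation is attached."""
    x, y = a.encode("utf-8"), b.encode("utf-8")
    return (x > y) - (x < y)


def texteq_model(a: str, b: str, deterministic: bool) -> bool:
    """Stand-in for texteq: bitwise equality under a deterministic
    collation; casefold() approximates a nondeterministic one."""
    if deterministic:
        return a.encode("utf-8") == b.encode("utf-8")
    return a.casefold() == b.casefold()


def consistent(a: str, b: str, deterministic: bool) -> bool:
    """True when equality agrees with 'comparison returned 0'."""
    return texteq_model(a, b, deterministic) == (bytewise_cmp(a, b) == 0)


if __name__ == "__main__":
    print(consistent("abc", "ABC", deterministic=True))
    print(consistent("abc", "ABC", deterministic=False))
```

In this model the two members agree for "abc" versus "ABC" under a deterministic collation and disagree under the nondeterministic stand-in, which is the kind of inconsistency at issue.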

bpchar_pattern_ops has the same issue with respect to bpchareq.

The obvious fix for this is to invent separate new equality operators,
but that's actually rather disastrous for performance, because
text_pattern_ops indexes would no longer be able to use WHERE clauses
using plain equality. That also feeds into whether equality clauses
deduced from equivalence classes will work for them (nope, not any
more). People using such indexes are just about certain to be
bitterly unhappy.

We may not have any choice but to do that, though --- I sure don't
see any other easy fix. If we could be certain that the collation
attached to the operator is deterministic, then it would still work
with a pattern_ops index, but that's not a concept that the index
infrastructure has got right now.

Whatever we do about this is likely to require a catversion bump,
meaning we've got to fix it *now*.

regards, tom lane

