Re: Nondeterministic collations vs. text_pattern_ops

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Nondeterministic collations vs. text_pattern_ops
Date: 2019-09-18 15:04:13
Message-ID: 25976.1568819053@sss.pgh.pa.us
Lists: pgsql-hackers

Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> writes:
> Here is a draft patch.

> It will require a catversion change because those operator classes don't
> have assigned OIDs so far.

That's slightly annoying given where we are with v12. We could
avoid it by looking up the opclass's opfamily and seeing if it's
TEXT_BTREE_FAM_OID etc., which do already have hand-assigned OIDs.
But maybe avoiding a catversion bump now is not worth the cost of
an extra syscache lookup. (It'd give me an excuse to shove the
leakproofness-marking changes from the other thread into v12, so
there's that.)

Speaking of extra syscache lookups, I don't like that you rearranged
the if-test to check nondeterminism before the opclass identity checks.
That's usually going to be a wasted lookup.
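
The ordering concern can be sketched in standalone C. This is only an illustration, not the patch's code: the Oid typedef, the DEMO_* constants, and the demo_* functions are hypothetical stand-ins, and the "lookup" is simulated with a counter rather than a real syscache access. It shows that testing the cheap opclass identity first lets the && short-circuit skip the lookup for every ordinary opclass.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are not PostgreSQL's real OIDs or APIs. */
typedef unsigned int Oid;

#define DEMO_TEXT_PATTERN_OPS     9001u
#define DEMO_VARCHAR_PATTERN_OPS  9002u
#define DEMO_BPCHAR_PATTERN_OPS   9003u
#define DEMO_NONDET_COLLATION     9100u

/* Counts how often the simulated "expensive" lookup runs. */
static int demo_lookup_count = 0;

/* Simulated catalog lookup: every call is counted as one unit of cost. */
static bool
demo_collation_is_deterministic(Oid collation)
{
    demo_lookup_count++;
    return collation != DEMO_NONDET_COLLATION;
}

/* Cheap test: a few integer comparisons, no lookup involved. */
static bool
demo_is_pattern_opclass(Oid opclass)
{
    return opclass == DEMO_TEXT_PATTERN_OPS ||
           opclass == DEMO_VARCHAR_PATTERN_OPS ||
           opclass == DEMO_BPCHAR_PATTERN_OPS;
}

/*
 * Returns true if the opclass/collation pair must be rejected.
 * The identity test comes first, so the lookup only happens for
 * the three pattern opclasses.
 */
static bool
demo_must_reject(Oid opclass, Oid collation)
{
    assert(opclass != 0);       /* 0 plays the role of an invalid OID here */

    return demo_is_pattern_opclass(opclass) &&
           !demo_collation_is_deterministic(collation);
}
```

Calling demo_must_reject() with an opclass that is not one of the three pattern opclasses leaves demo_lookup_count unchanged; with the tests in the opposite order, every call would pay for the lookup.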

> The comment block I just moved over for the time being. It should
> probably be rephrased a bit.

Indeed. Maybe something like:

* text_pattern_ops uses text_eq as the equality operator, which is
* fine as long as the collation is deterministic; text_eq then
* reduces to bitwise equality and so it is semantically compatible
* with the other operators and functions in the opclass. But with a
* nondeterministic collation, text_eq could yield results that are
* incompatible with the actual behavior of the index (which is
* determined by the opclass's comparison function). We prevent
* such problems by refusing creation of an index with this opclass
* and a nondeterministic collation.
*
* The same applies to varchar_pattern_ops and bpchar_pattern_ops.
* If we find more cases, we might decide to create a real mechanism
* for marking opclasses as incompatible with nondeterminism; but
* for now, this small hack suffices.
*
* Another solution is to use a special operator, not text_eq, as the
* equality opclass member, but that is undesirable because it would
* prevent index usage in many queries that work fine today.
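
The incompatibility described in the first paragraph of that comment can be illustrated with a toy model in standalone C. Everything here is an assumption made for the sake of the example: demo_eq_case_insensitive stands in for text_eq under a case-insensitive (nondeterministic) collation, demo_cmp_bytewise stands in for the opclass's comparison function, and a sorted array with binary search stands in for the btree. None of it is PostgreSQL code.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Stand-in for equality under a case-insensitive collation. */
static bool
demo_eq_case_insensitive(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    while (*a && *b)
    {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return false;
        a++;
        b++;
    }
    return *a == *b;            /* equal only if both strings ended */
}

/* Stand-in for the opclass comparison function: plain bytewise order. */
static int
demo_cmp_bytewise(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b);
}

/*
 * "Index scan": binary search over keys sorted by demo_cmp_bytewise.
 * Returns the position of the match, or -1 if there is none.
 */
static int
demo_index_find(const char *const *keys, int nkeys, const char *probe)
{
    int         lo = 0;
    int         hi = nkeys - 1;

    while (lo <= hi)
    {
        int         mid = lo + (hi - lo) / 2;
        int         c = demo_cmp_bytewise(keys[mid], probe);

        if (c == 0)
            return mid;
        if (c < 0)
            lo = mid + 1;
        else
            hi = mid - 1;
    }
    return -1;
}

/* "Sequential scan": applies the equality operator to every key. */
static int
demo_seqscan_find(const char *const *keys, int nkeys, const char *probe)
{
    for (int i = 0; i < nkeys; i++)
    {
        if (demo_eq_case_insensitive(keys[i], probe))
            return i;
    }
    return -1;
}
```

With keys {"ABC", "abd", "xyz"} (already in bytewise order) and the probe "abc", the sequential scan reports a match at position 0 while the index search reports none. In this toy model, that is the kind of disagreement the proposed check is meant to rule out.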

regards, tom lane
