Re: Explicit deterministic COLLATE fails with pattern matching operations on column with non-deterministic collation

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	James Lucas <jlucasdba(at)gmail(dot)com>
Cc:	"David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject:	Re: Explicit deterministic COLLATE fails with pattern matching operations on column with non-deterministic collation
Date:	2020-05-28 18:29:28
Message-ID:	7417.1590690568@sss.pgh.pa.us
Lists:	pgsql-bugs

James Lucas <jlucasdba(at)gmail(dot)com> writes:
> I tried setting up a pathological test case for this, and it seems
> like at least currently, even with a non-deterministic collation
> statistics still count values as distinct, even if the default
> collation would consider them equivalent. Not sure if that's as
> intended or not?

I experimented with this, and what I'm seeing is that ucol_strcollUTF8()
reports that 'aaa' is different from 'aAa'. So the behavior on the
Postgres side is as expected. I suspect that the 'en-US-ks-level2'
ICU locale doesn't act as you think it does. (That is, just saying
that a collation is nondeterministic doesn't make it so; it only forces
Postgres through slower code paths that allow for the possibility of
bitwise-unequal strings being reported as equal by ICU.) Not knowing
anything about ICU, I can't say more than that.

[ Tested on libicu-60.3-2.el8_1 ]
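
The distinction drawn above (the nondeterministic flag only chooses a code path, while the collator's own comparison decides equality) can be sketched with a toy model. This is an illustration, not PostgreSQL source: `compare`, `case_sensitive`, and `case_insensitive` are hypothetical names, the two collators stand in for whatever an ICU call such as ucol_strcollUTF8() would return, and the bytewise tie-break for deterministic collations is an assumption about the Postgres side that the message above does not spell out.

```python
# Toy model, not PostgreSQL source: `collate` stands in for an ICU comparison
# such as ucol_strcollUTF8(), and `deterministic` for the collation's flag.

def compare(a, b, collate, deterministic):
    """Return a negative, zero, or positive int, like a C comparison function."""
    result = collate(a, b)
    if result == 0 and deterministic:
        # Assumed behavior: a deterministic collation breaks collator-level
        # ties bytewise, so only byte-identical strings compare equal.
        ab, bb = a.encode("utf-8"), b.encode("utf-8")
        result = (ab > bb) - (ab < bb)
    return result

def case_sensitive(a, b):
    # Hypothetical collator that distinguishes case.
    return (a > b) - (a < b)

def case_insensitive(a, b):
    # Hypothetical collator that ignores case.
    x, y = a.lower(), b.lower()
    return (x > y) - (x < y)

# Nondeterministic flag, but the collator itself reports a difference.
nondet_sensitive = compare("aaa", "aAa", case_sensitive, deterministic=False)

# Nondeterministic flag, and the collator really treats the strings as equal.
nondet_insensitive = compare("aaa", "aAa", case_insensitive, deterministic=False)

# Deterministic flag: a collator-level tie is broken bytewise.
det_insensitive = compare("aaa", "aAa", case_insensitive, deterministic=True)
```

In this model only the second call yields equality. Setting the flag without a collator that actually ignores case, as in the first call, still reports the strings as different.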

regards, tom lane
