Re: ExecBuildGroupingEqual versus collations

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: ExecBuildGroupingEqual versus collations
Date: 2018-12-15 17:01:20
Message-ID: 15301.1544893280@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
>> On 2018-12-14 14:25:30 -0500, Tom Lane wrote:
>>> Now, it's certainly true that nameeq() doesn't need a collation spec
>>> today, any more than texteq() does, because both types legislate that
>>> equality is bitwise. But if we leave ExecBuildGroupingEqual like this,
>>> we're mandating that no type anywhere, ever, can have a
>>> collation-dependent notion of equality. Is that really a restriction
>>> we're comfortable with? citext is sort of the poster child here,
>>> because it already wishes it could have collation-dependent equality.

>> Don't we rely on that in other places too?

> Possibly; the regression tests probably don't stress type "name" too hard,
> so my Assert might well not be finding some other code paths that do
> likewise.

To investigate this a bit more, I added a similar Assert in texteq(),
and also grepped for places that were using FunctionCall2 to call
equality functions. The attached patch shows what I had to change
to get check-world to pass. It's possible that there are one or two
more places I missed --- two of these changes were found by grepping
but were *not* exposed by regression testing. But this ought to be
a pretty good indication of the scope of the problem.

There are a couple of places touched here that know they are invoking
texteq specifically, so we wouldn't really need to change them to have
a working feature. In all I found six places that we'd need to deal with
if we want to support collation-dependent equality.

A possible medium-term compromise is to pass DEFAULT_COLLATION_OID,
as I've done here, so that at least a collation-examining equality
function won't fall over. A variant of that is to pass C_COLLATION_OID,
which presumably would be faster to execute in many cases; it would
also have the advantage of being database-independent, which might
be a good thing in execReplication.c for instance.

Or we could just decide that collation-dependent equality is a bridge
too far for today. I'm not hugely excited about pursuing it myself,
I just wanted to get a handle on how badly broken it is.

regards, tom lane

Attachment Content-Type Size
equality-with-collation.patch text/x-diff 9.0 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-12-15 17:35:09 Improving collation-dependent indexes in system catalogs
Previous Message Tom Lane 2018-12-15 15:45:21 Re: Variable-length FunctionCallInfoData