Re: ExecBuildGroupingEqual versus collations

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: ExecBuildGroupingEqual versus collations
Date: 2018-12-14 20:43:06
Message-ID: 20181214204306.764jihyvrumiftkr@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2018-12-14 14:25:30 -0500, Tom Lane wrote:
> In pursuit of places that might not be on board with non-default
> collations, I tried sticking
>
> Assert(PG_GET_COLLATION() == C_COLLATION_OID);
>
> into nameeq() and the other name comparison functions, in my build with
> type name's typcollation changed to "C". I'm down to one place in the
> core regression tests that falls over because of that:
> ExecBuildGroupingEqual() figures it can get away with
>
> InitFunctionCallInfoData(*fcinfo, finfo, 2,
> InvalidOid, NULL, NULL);
>
> i.e. passing collation = InvalidOid to the type-specific equality
> functions.

Yea, I just cargo-culted that behaviour forward, the code previously
did:

bool
execTuplesMatch(TupleTableSlot *slot1,
TupleTableSlot *slot2,
int numCols,
AttrNumber *matchColIdx,
FmgrInfo *eqfunctions,
MemoryContext evalContext)
...
/* Apply the type-specific equality function */

if (!DatumGetBool(FunctionCall2(&eqfunctions[i],
attr1, attr2)))
{
result = false; /* they aren't equal */
break;
}

> Now, it's certainly true that nameeq() doesn't need a collation spec
> today, any more than texteq() does, because both types legislate that
> equality is bitwise. But if we leave ExecBuildGroupingEqual like this,
> we're mandating that no type anywhere, ever, can have a
> collation-dependent notion of equality. Is that really a restriction
> we're comfortable with? citext is sort of the poster child here,
> because it already wishes it could have collation-dependent equality.

Don't we rely on that in other places too?

> We'd have to do some work to pass a reasonable collation choice through
> to this code from wherever we have the info available, but I don't
> think it'd be a huge amount of work.

Yea, I think it shouldn't be too hard for most callers.

I think one issue here is that it's not clear how it'd be sensible to
have collation dependent equality for cases where we don't have a real
way to override the collation for an operation. It's not too hard to
imagine adding something to GROUP BY, but set operations seem harder.

> BTW, this case is perhaps a bit of a counterexample to Peter's idea about
> doing collation lookups earlier: if we change this, we'd presumably have
> to do collation lookup right here, even though we know very well that
> common types' equality functions won't use the information.

Hm.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Isaac Morland 2018-12-14 20:51:05 Re: 'infinity'::Interval should be added
Previous Message Andrew Gierth 2018-12-14 20:37:46 Re: Ryu floating point output patch