Re: SQL:2011 application time

From: Paul Jungwirth <pj(at)illuminatedcomputing(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: SQL:2011 application time
Date: 2024-03-23 17:42:47
Message-ID: d579a67c-e8d7-4eb7-ae14-f6fca01489cf@illuminatedcomputing.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

v33 attached with minor changes.

On 3/22/24 20:02, jian he wrote:
> On Fri, Mar 22, 2024 at 11:49 PM Peter Eisentraut <peter(at)eisentraut(dot)org> wrote:
>> But after reading the comment in the function about collations, I think
>> there could be trouble. As long as we are only comparing for equality
>> (and we don't support nondeterministic global collations), then we can
>> use any collation to compare for equality. But if we are doing
>> contained-by, then the collation does matter, so we would need to get
>> the actual collation somehow. So as written, this might not always work
>> correctly.
>>
>> I think it would be safer for now if we just kept using the equality
>> operation even for temporal foreign keys. If we did that, then in the
>> case that you update a key to a new value that is contained by the old
>> value, this function would say "not equal" and fire all the checks, even
>> though it wouldn't need to. This is kind of similar to the "false
>> negatives" that the comment already talks about.
>>
>> What do you think?
>>
>
> we don't need to worry about primary key and foreign key with
> different collation.
> because it will be error out as incompatible data type,
> foreign key constraint will not be created.

I agree with jian he here. Here is my own investigation:

Rangetypes themselves are never collatable (see DefineRange in commands/typecmds.c).
But rangetypes do store a collation for their base type. So you can say:

paul=# create type textrange as range (subtype = text, collation = "C");
CREATE TYPE

That is stored in pg_range.rngcollation, but pg_type.typcollation is always zero.

So putting a collection on a rangetype column is an error:

paul=# create table t (r1 textrange collate "en-US-x-icu");
ERROR: collations are not supported by type textrange

And so is using an ad hoc collation with an operator:

paul=# select '[J,J]'::textrange <@ '[a,z]'::textrange collate "en-US-x-icu";
ERROR: collations are not supported by type textrange
LINE 1: select '[J,J]'::textrange <@ '[a,z]'::textrange collate "en-...

Almost everything ranges do is built on range_cmp_bounds, which uses the base type's collation.
There is no way to use a different one.
So when ri_CompareWithCast calls `lhs <@ rhs`, it is using the collation for that range's base type.
Indexes will use the same collation.

You also can't mix different range types.
Our textrange puts (English) lowercase after uppercase:

paul=# select '[j,j]'::textrange <@ '[a,z]'::textrange;
?column?
----------
t
(1 row)

paul=# select '[J,J]'::textrange <@ '[a,z]'::textrange;
?column?
----------
f
(1 row)

We could create a rangetype that intermingles uppercase & lower:

paul=# create type itextrange as range (subtype = text, collation = "en-US-x-icu");
CREATE TYPE
paul=# select '[J,J]'::itextrange <@ '[a,z]'::itextrange;
?column?
----------
t
(1 row)

But you can't mix them:

paul=# select '[J,J]'::itextrange <@ '[a,z]'::textrange;
ERROR: operator does not exist: itextrange <@ textrange
LINE 1: select '[J,J]'::itextrange <@ '[a,z]'::textrange;
^
HINT: No operator matches the given name and argument types. You might need to add explicit type casts.

Even if I create casts, mixing still fails:

paul=# create cast (textrange as itextrange) without function as implicit;
CREATE CAST
paul=# create cast (itextrange as textrange) without function as implicit;
CREATE CAST
paul=# select '[J,J]'::itextrange <@ '[a,z]'::textrange;
ERROR: operator does not exist: itextrange <@ textrange
LINE 1: select '[J,J]'::itextrange <@ '[a,z]'::textrange;
^
HINT: No operator matches the given name and argument types. You might need to add explicit type casts.

That's because the operator parameters are anyrange, and in can_coerce_type we call
check_generic_type_consistency which doesn't use casts.
It just asks if all the concrete range types are the same (as with other polymorphic types).

Adding a foreign key runs the same check:

paul=# create table pk (id int4range, valid_at textrange, constraint pkpk primary key (id, valid_at
without overlaps));
CREATE TABLE
paul=# create table fk (id int4range, valid_at itextrange, parent_id int4range);
CREATE TABLE
paul=# alter table fk add constraint fkfk foreign key (parent_id, period valid_at) references pk;
ERROR: foreign key constraint "fkfk" cannot be implemented
DETAIL: Key columns "valid_at" and "valid_at" are of incompatible types: itextrange and textrange.

I guess the user could define their own `textrange <@ itextrange` operator, using the lhs collation.
We would choose that operator for pfeqop but not ppeqop or ffeqop.
And we use ffeqop here, which would allow us to skip a check that pfeqop would fail.
Is that an issue? It feels like the user is doing their best to get nonsense results at that point,
and it's not really about the collation per se.

Incidentally here is another separate issue with foreign keys and collations I noticed this morning:
https://www.postgresql.org/message-id/78d824e0-b21e-480d-a252-e4b84bc2c24b%40illuminatedcomputing.com
That comes from nondeterministic collations, which feel like a troublesome thing here.
Probably foreign keys just weren't fully re-thought when we added them.

But we avoid the issue from 59a85cb4 (discussion at
https://www.postgresql.org/message-id/flat/3326fc2e-bc02-d4c5-e3e5-e54da466e89a(at)2ndquadrant(dot)com)
about cascading changes when a PK experiences a not-binary-identical change that the collation
considers equal. These days we only call ri_CompareWithCast for changes on the FK side.

Now this is a long chain of reasoning to say rangetypes are safe. I added a comment. Note it doesn't
apply to arbitrary types, so if we support those eventually we should just require a recheck always,
or alternately use equals, not containedby. (That would require storing equals somewhere. It could
go in ffeqop, but that feels like a footgun since pfeqop and ppeqop need overlaps.)

On 3/21/24 22:33, jian he wrote:
> i think on update restrict, on delete restrict cannot be deferred,
> even if you set it DEFERRABLE INITIALLY DEFERRED.
> based on this idea, I made minor change on
> v32-0002-Support-multiranges-in-temporal-FKs.patch

Okay, added those tests too. Thanks!

Rebased to 697f8d266c.

Yours,

--
Paul ~{:-)
pj(at)illuminatedcomputing(dot)com

Attachment Content-Type Size
v33-0001-Add-temporal-FOREIGN-KEYs.patch text/x-patch 123.9 KB
v33-0002-Support-multiranges-in-temporal-FKs.patch text/x-patch 52.0 KB
v33-0003-Add-support-funcs-for-FOR-PORTION-OF.patch text/x-patch 43.0 KB
v33-0004-Add-UPDATE-DELETE-FOR-PORTION-OF.patch text/x-patch 160.2 KB
v33-0005-Add-CASCADE-SET-NULL-SET-DEFAULT-for-temporal-fo.patch text/x-patch 200.4 KB
v33-0006-Add-PERIODs.patch text/x-patch 283.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2024-03-23 17:57:49 Re: pg_upgrade --copy-file-range
Previous Message Paul Jungwirth 2024-03-23 17:04:04 altering a column's collation leaves an invalid foreign key