Re: Allow foreign keys to reference a superset of unique columns

From: Kaiting Chen <ktchen14(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, hellopfm(at)gmail(dot)com
Subject: Re: Allow foreign keys to reference a superset of unique columns
Date: 2022-09-27 21:35:15
Message-ID: CA+CLzG-U4Rf4mPr64hCsPMRA4bMx8PVVCCbT5+ES9M1rCDWgHw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> For one example of where the semantics get fuzzy, it's not
> very clear how the extra-baggage columns ought to participate in
> CASCADE updates. Currently, if we have
> CREATE TABLE foo (a integer PRIMARY KEY, b integer);
> then an update that changes only foo.b doesn't need to update
> referencing tables, and I think we even have optimizations that
> assume that if no unique-key columns are touched then RI checks
> need not be made. But if you did
> CREATE TABLE bar (x integer, y integer,
> FOREIGN KEY (x, y) REFERENCES foo(a, b) ON UPDATE
CASCADE);
> then perhaps you expect bar.y to be updated ... or maybe you don't?

I'd expect bar.y to be updated. In my mind, the FOREIGN KEY constraint
should
behave the same, regardless of whether the underlying unique index on the
referenced side is an equivalent set to, or a strict subset of, the
referenced
columns.

> Another example is that I think the idea is only well-defined when
> the subset column(s) are a primary key, or at least all marked NOT NULL.
> Otherwise they're not as unique as you're claiming. But then the FK
> constraint really has to be dependent on a PK constraint not just an
> index definition, since indexes in themselves don't enforce not-nullness.
> That gets back to Peter's complaint that referring to an index isn't
> good enough.

I think that uniqueness should be guaranteed enough even if the subset
columns
are nullable:

CREATE TABLE foo (a integer UNIQUE, b integer);

CREATE TABLE bar (
x integer,
y integer,
FOREIGN KEY (x, y) REFERENCES foo(a, b)
);

The unique index underlying foo.a guarantees that (foo.a, foo.b) is unique
if
foo.a isn't NULL. That is, there can be multiple rows (NULL, 1) in foo.
However,
such a row can't be the target of the foreign key constraint anyway. So, I'm
fairly certain that, where it matters, a unique index on a nullable subset
of
the referenced columns guarantees a distinct referenced row.

> It's also unclear to me how this ought to interact with the
> information_schema views concerning foreign keys. We generally
> feel that we don't want to present any non-SQL-compatible data
> in information_schema, for fear that it will confuse applications
> that expect to see SQL-spec behavior there. So do we leave such
> FKs out of the views altogether, or show only the columns involving
> the associated unique constraint? Neither answer seems pleasant.

Here's the information_schema output for this example:

CREATE TABLE foo (a integer, b integer);

CREATE UNIQUE INDEX ON foo (a, b);

CREATE TABLE bar (
x integer,
y integer,
FOREIGN KEY (x, y) REFERENCES foo(a, b)
);

# SELECT * FROM information_schema.referential_constraints
WHERE constraint_name = 'bar_x_y_fkey';

-[ RECORD 1 ]-------------+----------------------------------------------
constraint_catalog | kaitingc
constraint_schema | public
constraint_name | bar_x_y_fkey
unique_constraint_catalog |
unique_constraint_schema |
unique_constraint_name |
match_option | NONE
update_rule | NO ACTION
delete_rule | NO ACTION

# SELECT * FROM information_schema.key_column_usage
WHERE constraint_name = 'bar_x_y_fkey';

-[ RECORD 173
]---------------+----------------------------------------------
constraint_catalog | kaitingc
constraint_schema | public
constraint_name | bar_x_y_fkey
table_catalog | kaitingc
table_schema | public
table_name | bar
column_name | x
ordinal_position | 1
position_in_unique_constraint | 1
-[ RECORD 174
]---------------+----------------------------------------------
constraint_catalog | kaitingc
constraint_schema | public
constraint_name | bar_x_y_fkey
table_catalog | kaitingc
table_schema | public
table_name | bar
column_name | y
ordinal_position | 2
position_in_unique_constraint | 2

It appears that currently in PostgreSQL, the unique_constraint_catalog,
schema,
and name are NULL in referential_constraints when a unique index (without an
associated unique constraint) underlies the referenced columns. The
behaviour
I'm proposing would have the same behavior vis-a-vis
referential_constraints.

As for key_column_usage, I propose that position_in_unique_constraint be
NULL if
the referenced column isn't indexed.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jacob Champion 2022-09-27 21:45:55 Re: [PoC] Federated Authn/z with OAUTHBEARER
Previous Message Tom Lane 2022-09-27 21:29:14 Re: longfin and tamandua aren't too happy but I'm not sure why