[PATCH] Partial foreign key updates in referential integrity triggers

From: Paul Martinez <paulmtz(at)google(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: [PATCH] Partial foreign key updates in referential integrity triggers
Date: 2021-01-05 21:40:38
Message-ID: CACqFVBZQyMYJV=njbSMxf+rbDHpx=W=B7AEaMKn8dWn9OZJY7w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hey, hackers!

I've created a patch to better support referential integrity constraints when
using composite primary and foreign keys. This patch allows creating a foreign
key using the syntax:

FOREIGN KEY (tenant_id, fk_id) REFERENCES fktable ON DELETE SET NULL (fk_id)

which means that only the fk_id column will be set to NULL when the referenced
row is deleted, rather than both the tenant_id and fk_id columns.

In multi-tenant applications, it is common to denormalize a "tenant_id" column
across every table, and use composite primary keys of the form (tenant_id, id)
and composite foreign keys of the form (tenant_id, fk_id), reusing the
tenant_id column in both the primary and foreign key.

This is often done initially for performance reasons, but has the added benefit
of making it impossible for data from one tenant to reference data from another
tenant, also making this a good decision from a security perspective.

Unfortunately, one downside of using composite foreign keys in such a matter
is that commonly used referential actions, such as ON DELETE SET NULL, no
longer work, because Postgres tries to set all of the referencing columns to
NULL, including the columns that overlap with the primary key:

CREATE TABLE tenants (id serial PRIMARY KEY);
CREATE TABLE users (
tenant_id int REFERENCES tenants ON DELETE CASCADE,
id serial,
PRIMARY KEY (tenant_id, id),
);
CREATE TABLE posts (
tenant_id int REFERENCES tenants ON DELETE CASCADE,
id serial,
author_id int,
PRIMARY KEY (tenant_id, id),
FOREIGN KEY (tenant_id, author_id) REFERENCES users ON DELETE SET NULL
);

INSERT INTO tenants VALUES (1);
INSERT INTO users VALUES (1, 101);
INSERT INTO posts VALUES (1, 201, 101);
DELETE FROM users WHERE id = 101;
ERROR: null value in column "tenant_id" violates not-null constraint
DETAIL: Failing row contains (null, 201, null).

This can be resolved by manually creating triggers on the referenced table, but
this is clunky and adds a significant amount of extra work when adding (or
removing!) foreign keys. Users shouldn't have to compromise on maintenance
overhead when using composite foreign keys.

I have implemented a simple extension to the syntax for foreign keys that
makes it just as easy to support referential integrity constraints for
composite foreign keys as it is for single column foreign keys. The SET NULL
and SET DEFAULT referential actions can now be optionally followed by a column
list:

key_action:
| NO ACTION
| RESTRICT
| CASCADE
| SET NULL [ (column_name [, ...] ) ]
| SET DEFAULT [ (column_name [, ...] ) ]

When a column list is provided, only the specified columns, which must be a
subset of the referencing columns, will be updated in the associated trigger.

Note that use of SET NULL (col1, ...) on a composite foreign key with MATCH
FULL specified will still raise an error. In such a scenario we could raise an
error when the user tries to define the foreign key, but we don't raise a
similar error when a user tries to use SET NULL on a non-nullable column, so I
don't think this is critical. (I haven't added this check in my patch.) While
this feature would mostly be used with the default MATCH SIMPLE, I could
imagine using SET DEFAULT (col, ...) even for MATCH FULL foreign keys.

To store this additional data, I've added two columns to the pg_constraint
catalog entry:

confupdsetcols int2[]
confdelsetcols int2[]

These columns store the attribute numbers of the columns to update in the
ON UPDATE and ON DELETE triggers respectively. If the arrays are empty, then
all of the referencing columns should be updated.

I previously proposed this feature about a year ago [1], but I don't feel like
the arguments against it were very strong. Wanting to get more familiar with the
Postgres codebase I decided to implement the feature over this holiday break,
and I've gotten everything working and put together a complete patch including
tests and updates to documentation. Hopefully if people find it useful it can
make its way into the next commitfest!

Visual diff:
https://github.com/postgres/postgres/compare/master...PaulJuliusMartinez:on-upd-del-set-cols

Here's a rough outline of the changes:

src/backend/parser/gram.y | 122
src/include/nodes/parsenodes.h | 3
src/backend/nodes/copyfuncs.c | 2
src/backend/nodes/equalfuncs.c | 2
src/backend/nodes/outfuncs.c | 47
- Modify grammar to add opt_column_list after SET NULL and SET DEFAULT
- Add fk_{upd,del}_set_cols fields to Constraint struct
- Add proper node support, as well as outfuncs support for AlterTableStmt,
which I used while debugging

src/include/catalog/pg_constraint.h | 20
src/backend/catalog/pg_constraint.c | 80
src/include/catalog/catversion.h | 2
- Add confupdsetcols and confdelsetcols to pg_constraint catalog entry

src/backend/commands/tablecmds.c | 142
- Pass along data from parsed Constraint node to CreateConstraintEntry
- Handle propagating constraints for partitioned tables

src/backend/utils/adt/ri_triggers.c | 109
- Update construction of trigger query to only update specified columns
- Update caching mechanism since ON UPDATE and ON DELETE may modify different
sets of columns

src/backend/utils/adt/ruleutils.c | 29
- Update pg_get_constraintdef to handle new syntax
- This automatically updates psql \d output as well as pg_dump

src/backend/catalog/heap.c | 4
src/backend/catalog/index.c | 4
src/backend/commands/trigger.c | 4
src/backend/commands/typecmds.c | 4
src/backend/utils/cache/relcache.c | 2
- Update misc. call sites for updated functions

src/test/regress/sql/foreign_key.sql | 63
src/test/regress/expected/foreign_key.out | 88
- Regression tests with checks for error cases and partitioned tables

doc/src/sgml/catalogs.sgml | 24
doc/src/sgml/ref/create_table.sgml | 15
- Updated documentation for CREATE TABLE and pg_constraint catalog definition

20 files changed, 700 insertions(+), 66 deletions(-)

Thanks,
Paul

[1] https://www.postgresql.org/message-id/flat/CAF%2B2_SGRXQOtumethpuXhsyU%2B4AYzfKA5fhHCjCjH%2BjQ04WWjA%40mail.gmail.com

Attachment Content-Type Size
referential-actions-set-cols-v1.patch application/octet-stream 54.8 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-01-05 21:40:46 Re: plpgsql variable assignment with union is broken
Previous Message Tom Lane 2021-01-05 21:34:52 Re: Libpq support to connect to standby server as priority