Re: Allow foreign keys to reference a superset of unique columns

From: James Coleman <jtc331(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kaiting Chen <ktchen14(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, hellopfm(at)gmail(dot)com, Wolfgang Walther <walther(at)technowledgy(dot)de>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: Allow foreign keys to reference a superset of unique columns
Date: 2022-09-26 02:00:27
Message-ID: CAAaqYe98jrwbLJBkXnUitJX+c1F6YtOhbRmHWoLXvYhP_QFb9g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Kaiting,

The use case you're looking to handle seems interesting to me.

On Wed, Jul 27, 2022 at 3:11 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Kaiting Chen <ktchen14(at)gmail(dot)com> writes:
> > I'd like to propose a change to PostgreSQL to allow the creation of a foreign
> > key constraint referencing a superset of uniquely constrained columns.
>
> TBH, I think this is a fundamentally bad idea and should be rejected
> outright. It fuzzes the semantics of the FK relationship, and I'm
> not convinced that there are legitimate use-cases. Your example
> schema could easily be dismissed as bad design that should be done
> some other way.
>
> For one example of where the semantics get fuzzy, it's not
> very clear how the extra-baggage columns ought to participate in
> CASCADE updates. Currently, if we have
> CREATE TABLE foo (a integer PRIMARY KEY, b integer);
> then an update that changes only foo.b doesn't need to update
> referencing tables, and I think we even have optimizations that
> assume that if no unique-key columns are touched then RI checks
> need not be made. But if you did
> CREATE TABLE bar (x integer, y integer,
> FOREIGN KEY (x, y) REFERENCES foo(a, b) ON UPDATE CASCADE);
> then perhaps you expect bar.y to be updated ... or maybe you don't?
>
> Another example is that I think the idea is only well-defined when
> the subset column(s) are a primary key, or at least all marked NOT NULL.
> Otherwise they're not as unique as you're claiming. But then the FK
> constraint really has to be dependent on a PK constraint not just an
> index definition, since indexes in themselves don't enforce not-nullness.
> That gets back to Peter's complaint that referring to an index isn't
> good enough.
>
> Anyway, seeing that the patch touches neither ri_triggers.c nor any
> regression tests, I find it hard to believe that such semantic
> questions have been thought through.
>
> It's also unclear to me how this ought to interact with the
> information_schema views concerning foreign keys. We generally
> feel that we don't want to present any non-SQL-compatible data
> in information_schema, for fear that it will confuse applications
> that expect to see SQL-spec behavior there. So do we leave such
> FKs out of the views altogether, or show only the columns involving
> the associated unique constraint? Neither answer seems pleasant.
>
> FWIW, the patch is currently failing to apply per the cfbot.
> I think you don't need to manually update backend/nodes/ anymore,
> but the gram.y changes look to have been sideswiped by some
> other recent commit.

As I was reading through the email chain I had this thought: could you
get the same benefit (or 90% of it anyway) by instead allowing the
creation of a uniqueness constraint that contains more columns than
the index backing it? So long as the index backing it still guaranteed
the uniqueness on a subset of columns that would seem to be safe.

Tom notes the additional columns being nullable is something to think
about. But if we go the route of allowing unique constraints with
backing indexes having a subset of columns from the constraint I don't
think the nullability is an issue since it's already the case that a
unique constraint can be declared on columns that are nullable. Indeed
it's also the case that we already support a foreign key constraint
backed by a unique constraint including nullable columns.

Because such an approach would, I believe, avoid changing the foreign
key code (or semantics) at all, I believe that would address Tom's
concerns about information_schema and fuzziness of semantics.

After writing down that idea I noticed Wolfgang Walther had commented
similarly, but it appears that that idea got lost (or at least not
responded to).

I'd be happy to sign up to review an updated patch if you're
interested in continuing this effort. If so, could you register the
patch in the CF app (if not there already)?

Thanks,
James Coleman

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2022-09-26 02:11:16 Re: Refactor backup related code (was: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?)
Previous Message Andres Freund 2022-09-26 00:38:12 Re: [RFC] building postgres with meson - v13