Re: simplifying foreign key/RI checks

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Zhihong Yu <zyu(at)yugabyte(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: simplifying foreign key/RI checks
Date: 2021-01-19 06:46:28
Message-ID: CADkLM=fAojeyK0tgg3RqfJhjAcJWxF8kS1uxObGkQXjTuSQTig@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 18, 2021 at 9:45 PM Amit Langote <amitlangote09(at)gmail(dot)com>
wrote:

> On Tue, Jan 19, 2021 at 2:47 AM Zhihong Yu <zyu(at)yugabyte(dot)com> wrote:
> >
> > Hi,
> > I was looking at this statement:
> >
> > insert into f select generate_series(1, 2000000, 2);
> >
> > Since certain generated values (the second half) are not in table p,
> wouldn't insertion for those values fail ?
> > I tried a scaled down version (1000th) of your example:
> >
> > yugabyte=# insert into f select generate_series(1, 2000, 2);
> > ERROR: insert or update on table "f" violates foreign key constraint
> "f_a_fkey"
> > DETAIL: Key (a)=(1001) is not present in table "p".
>
> Sorry, a wrong copy-paste by me. Try this:
>
> create table p (a numeric primary key);
> insert into p select generate_series(1, 2000000);
> create table f (a bigint references p);
>
> -- Unpatched
> insert into f select generate_series(1, 2000000, 2);
> INSERT 0 1000000
> Time: 6527.652 ms (00:06.528)
>
> update f set a = a + 1;
> UPDATE 1000000
> Time: 8108.310 ms (00:08.108)
>
> -- Patched:
> insert into f select generate_series(1, 2000000, 2);
> INSERT 0 1000000
> Time: 3312.193 ms (00:03.312)
>
> update f set a = a + 1;
> UPDATE 1000000
> Time: 4292.807 ms (00:04.293)
>
> > For v1-0002-Avoid-using-SPI-for-some-RI-checks.patch :
> >
> > + * Collect partition key values from the unique key.
> >
> > At the end of the nested loop, should there be an assertion that
> partkey->partnatts partition key values have been found ?
> > This can be done by using a counter (initialized to 0) which is
> incremented when a match is found by the inner loop.
>
> I've updated the patch to add the Assert. Thanks for taking a look.
>
> --
> Amit Langote
> EDB: http://www.enterprisedb.com

v2 patch applies and passes make check and make check-world. Perhaps, given
the missing break at line 418 without any tests failing, we could add
another regression test if we're into 100% code path coverage. As it is, I
think the compiler warning was a sufficient alert.

The code is easy to read, and the comments touch on the major points of
what complexities arise from partitioned tables.

A somewhat pedantic complaint I have brought up off-list is that this patch
continues the pattern of the variable and function names making the
assumption that the foreign key is referencing the primary key of the
referenced table. Foreign key constraints need only reference a unique
index, it doesn't have to be the primary key. Granted, that unique index is
behaving exactly as a primary key would, so conceptually it is very
similar, but keeping with the existing naming (pk_rel, pk_type, etc) can
lead a developer to think that it would be just as correct to find the
referenced relation and get the primary key index from there, which would
not always be correct. This patch correctly grabs the index from the
constraint itself, so no problem there.

I like that this patch changes the absolute minimum of the code in order to
get a very significant performance benefit. It does so in a way that should
reduce resource pressure found in other places [1]. This will in turn
reduce the performance penalty of "doing the right thing" in terms of
defining enforced foreign keys. It seems to get a clearer performance boost
than was achieved with previous efforts at statement level triggers.

This patch completely sidesteps the DELETE case, which has more insidious
performance implications, but is also far less common, and whose solution
will likely be very different.

[1]
https://www.postgresql.org/message-id/CAKkQ508Z6r5e3jdqhfPWSzSajLpHo3OYYOAmfeSAuPTo5VGfgw@mail.gmail.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-01-19 06:48:08 Re: ResourceOwner refactoring
Previous Message Craig Ringer 2021-01-19 06:42:31 Re: [PATCH] ProcessInterrupts_hook