Re: More efficient RI checks - take 2

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Antonin Houska <ah(at)cybertec(dot)at>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: More efficient RI checks - take 2
Date: 2020-04-08 17:05:45
Message-ID: CAFj8pRAGtHrJ4YKUpC5YNWrLyyTsxjfucDp5f7tx6CS7po1xDA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

st 8. 4. 2020 v 18:36 odesílatel Antonin Houska <ah(at)cybertec(dot)at> napsal:

> After having reviewed [1] more than a year ago (the problem I found was
> that
> the transient table is not available for deferred constraints), I've tried
> to
> implement the same in an alternative way. The RI triggers still work as row
> level triggers, but if multiple events of the same kind appear in the
> queue,
> they are all passed to the trigger function at once. Thus the check query
> does
> not have to be executed that frequently.
>
> Some performance comparisons are below. (Besides the execution time, please
> note the difference in the number of trigger function executions.) In
> general,
> the checks are significantly faster if there are many rows to process, and
> a
> bit slower when we only need to check a single row. However I'm not sure
> about
> the accuracy if only a single row is measured (if a single row check is
> performed several times, the execution time appears to fluctuate).
>

It is hard task to choose good strategy for immediate constraints, but for
deferred constraints you know how much rows should be checked, and then you
can choose better strategy.

Is possible to use estimation for choosing method of RI checks?

> Comments are welcome.
>
> Setup
> =====
>
> CREATE TABLE p(i int primary key);
> INSERT INTO p SELECT x FROM generate_series(1, 16384) g(x);
> CREATE TABLE f(i int REFERENCES p);
>
>
> Insert many rows into the FK table
> ==================================
>
> master:
>
> EXPLAIN ANALYZE INSERT INTO f SELECT i FROM generate_series(1, 16384) g(i);
> QUERY PLAN
>
> --------------------------------------------------------------------------------------------------------------------------------
> Insert on f (cost=0.00..163.84 rows=16384 width=4) (actual
> time=32.741..32.741 rows=0 loops=1)
> -> Function Scan on generate_series g (cost=0.00..163.84 rows=16384
> width=4) (actual time=2.403..4.802 rows=16384 loops=1)
> Planning Time: 0.050 ms
> Trigger for constraint f_i_fkey: time=448.986 calls=16384
> Execution Time: 485.444 ms
> (5 rows)
>
> patched:
>
> EXPLAIN ANALYZE INSERT INTO f SELECT i FROM generate_series(1, 16384) g(i);
> QUERY PLAN
>
> --------------------------------------------------------------------------------------------------------------------------------
> Insert on f (cost=0.00..163.84 rows=16384 width=4) (actual
> time=34.053..34.053 rows=0 loops=1)
> -> Function Scan on generate_series g (cost=0.00..163.84 rows=16384
> width=4) (actual time=2.223..4.448 rows=16384 loops=1)
> Planning Time: 0.047 ms
> Trigger for constraint f_i_fkey: time=105.164 calls=8
> Execution Time: 141.201 ms
>
>
> Insert a single row into the FK table
> =====================================
>
> master:
>
> EXPLAIN ANALYZE INSERT INTO f VALUES (1);
> QUERY PLAN
>
> ------------------------------------------------------------------------------------------
> Insert on f (cost=0.00..0.01 rows=1 width=4) (actual time=0.060..0.060
> rows=0 loops=1)
> -> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.002..0.002
> rows=1 loops=1)
> Planning Time: 0.026 ms
> Trigger for constraint f_i_fkey: time=0.435 calls=1
> Execution Time: 0.517 ms
> (5 rows)
>
> patched:
>
> EXPLAIN ANALYZE INSERT INTO f VALUES (1);
> QUERY PLAN
>
> ------------------------------------------------------------------------------------------
> Insert on f (cost=0.00..0.01 rows=1 width=4) (actual time=0.066..0.066
> rows=0 loops=1)
> -> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.002..0.002
> rows=1 loops=1)
> Planning Time: 0.025 ms
> Trigger for constraint f_i_fkey: time=0.578 calls=1
> Execution Time: 0.670 ms
>
>
> Check if FK row exists during deletion from the PK
> ==================================================
>
> master:
>
> DELETE FROM p WHERE i=16384;
> ERROR: update or delete on table "p" violates foreign key constraint
> "f_i_fkey" on table "f"
> DETAIL: Key (i)=(16384) is still referenced from table "f".
> Time: 3.381 ms
>
> patched:
>
> DELETE FROM p WHERE i=16384;
> ERROR: update or delete on table "p" violates foreign key constraint
> "f_i_fkey" on table "f"
> DETAIL: Key (i)=(16384) is still referenced from table "f".
> Time: 5.561 ms
>
>
> Cascaded DELETE --- many PK rows
> ================================
>
> DROP TABLE f;
> CREATE TABLE f(i int REFERENCES p ON UPDATE CASCADE ON DELETE CASCADE);
> INSERT INTO f SELECT i FROM generate_series(1, 16384) g(i);
>
> master:
>
> EXPLAIN ANALYZE DELETE FROM p;
> QUERY PLAN
>
> -----------------------------------------------------------------------------------------------------------
> Delete on p (cost=0.00..236.84 rows=16384 width=6) (actual
> time=38.334..38.334 rows=0 loops=1)
> -> Seq Scan on p (cost=0.00..236.84 rows=16384 width=6) (actual
> time=0.019..3.925 rows=16384 loops=1)
> Planning Time: 0.049 ms
> Trigger for constraint f_i_fkey: time=31348.756 calls=16384
> Execution Time: 31390.784 ms
>
> patched:
>
> EXPLAIN ANALYZE DELETE FROM p;
> QUERY PLAN
>
> -----------------------------------------------------------------------------------------------------------
> Delete on p (cost=0.00..236.84 rows=16384 width=6) (actual
> time=33.360..33.360 rows=0 loops=1)
> -> Seq Scan on p (cost=0.00..236.84 rows=16384 width=6) (actual
> time=0.012..3.183 rows=16384 loops=1)
> Planning Time: 0.094 ms
> Trigger for constraint f_i_fkey: time=9.580 calls=8
> Execution Time: 43.941 ms
>
>
> Cascaded DELETE --- a single PK row
> ===================================
>
> INSERT INTO p SELECT x FROM generate_series(1, 16384) g(x);
> INSERT INTO f SELECT i FROM generate_series(1, 16384) g(i);
>
> master:
>
> DELETE FROM p WHERE i=16384;
> DELETE 1
> Time: 5.754 ms
>
> patched:
>
> DELETE FROM p WHERE i=16384;
> DELETE 1
> Time: 8.098 ms
>
>
> Cascaded UPDATE - many rows
> ===========================
>
> master:
>
> EXPLAIN ANALYZE UPDATE p SET i = i + 16384;
> QUERY PLAN
>
> ------------------------------------------------------------------------------------------------------------
> Update on p (cost=0.00..277.80 rows=16384 width=10) (actual
> time=166.954..166.954 rows=0 loops=1)
> -> Seq Scan on p (cost=0.00..277.80 rows=16384 width=10) (actual
> time=0.013..7.780 rows=16384 loops=1)
> Planning Time: 0.177 ms
> Trigger for constraint f_i_fkey on p: time=60405.362 calls=16384
> Trigger for constraint f_i_fkey on f: time=455.874 calls=16384
> Execution Time: 61036.996 ms
>
> patched:
>
> EXPLAIN ANALYZE UPDATE p SET i = i + 16384;
> QUERY PLAN
>
> ------------------------------------------------------------------------------------------------------------
> Update on p (cost=0.00..277.77 rows=16382 width=10) (actual
> time=159.512..159.512 rows=0 loops=1)
> -> Seq Scan on p (cost=0.00..277.77 rows=16382 width=10) (actual
> time=0.014..7.783 rows=16382 loops=1)
> Planning Time: 0.146 ms
> Trigger for constraint f_i_fkey on p: time=169.628 calls=9
> Trigger for constraint f_i_fkey on f: time=124.079 calls=2
> Execution Time: 456.072 ms
>
>
> Cascaded UPDATE - a single row
> ==============================
>
> master:
>
> UPDATE p SET i = i - 16384 WHERE i=32767;
> UPDATE 1
> Time: 4.858 ms
>
> patched:
>
> UPDATE p SET i = i - 16384 WHERE i=32767;
> UPDATE 1
> Time: 11.955 ms
>
>
> [1] https://commitfest.postgresql.org/22/1975/
>
> --
> Antonin Houska
> Web: https://www.cybertec-postgresql.com
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-04-08 17:07:54 Re: A problem about partitionwise join
Previous Message Stephen Frost 2020-04-08 17:05:28 Re: where should I stick that backup?