RE: Avoid CommandCounterIncrement in RI trigger when INSERT INTO referencing table

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Avoid CommandCounterIncrement in RI trigger when INSERT INTO referencing table
Date: 2021-03-11 12:01:15
Message-ID: OS0PR01MB57164B2AC1677D7FC617573994909@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


> > After some more on how to support parallel insert into fk relation.
> > It seems we do not have a cheap way to implement this feature.
> >
> > In RI_FKey_check, Currently, postgres execute "select xx for key
> > share" to check that foreign key exists in PK table.
> > However "select for update/share" is considered as parallel unsafe. It
> > may be dangerous to do this in parallel mode, we may want to change this.
>
> Hmm, I guess the parallel leader and workers can execute SELECT FOR KEY
> SHARE, if the parallelism infrastructure allows execution of SPI calls. The lock
> manager supports tuple locking in parallel leader and workers by the group
> locking. Also, the tuple locking doesn't require combo Cid, which is necessary
> for parallel UPDATE and DELETE.
>
> Perhaps the reason why SELECT FOR is treated as parallel-unsafe is that tuple
> locking modifies data pages to store lock information in the tuple header. But
> now, page modification is possible in parallel processes, so I think we can
> consider SELECT FOR as parallel-safe. (I may be too optimistic.)

I think you are right.
After reading the original parallel-safety check's commit message , README.tuplock, and have some discussions with the author.
I think the reason why [SELECT FOR UPDATE/SHARE] is parallel unsafe is that [SELECT FOR] will call GetCurrentCommandId(true).
GetCurrentCommandId(true) was not supported in parallel worker but [SELECT FOR] need command ID to mark the change(lock info).

Fortunately, With greg's parallel insert patch, we can use command ID in parallel worker.
So, IMO, In parallel insert case, the RI check is parallel safe.

> > And also, "select for update/share" is considered as "not read only"
> > which will force readonly = false in _SPI_execute_plan.
>
> read_only is used to do CCI. Can we arrange to skip CCI?

Yes, we can.
Currently, I try to add one parameter(need CCI) to SPI_execute_snapshot and _SPI_execute_plan to control CCI.
I was still researching is there a more elegant way.

> > At the same time, " simplifying foreign key/RI checks " thread is
> > trying to replace "select xx for key share" with
> > index_beginscan()+table_tuple_lock() (I think it’s parallel safe).
> > May be we can try to support parallel insert fk relation after "
> > simplifying foreign key/RI checks " patch applied ?
>
> Why do you think it's parallel safe?
>
> Can you try running parallel INSERT SELECT on the target table with FK and see
> if any problem happens?

I have tested this in various cases:
All the test results looks good.
* test different worker lock on the same tuple.
* test different worker lock on different tuple.
* test no lock.
* test lock with concurrent update
* test constraint error.


>
> Surprisingly, Oracle doesn't support parallel INSERT SELECT on a table with FK
> as follows. SQL Server doesn't mention anything, so I guess it's supported.
> This is a good chance for PostgreSQL to exceed Oracle.
>
> https://docs.oracle.com/en/database/oracle/oracle-database/21/vldbg/types-parallelism.html#GUID-D4CFC1F2-44D3-4BE3-B5ED-6A309EB8BF06
>
> Table 8-1 Referential Integrity Restrictions
> DML Statement Issued on Parent Issued on Child
> Self-Referential
> INSERT (Not applicable) Not parallelized Not parallelized

Ah, that's really good chance.

To support parallel insert into FK relation.
There are two scenarios need attention.
1) foreign key and primary key are on the same table(INSERT's target table).
(referenced and referencing are the same table)
2) referenced and referencing table are both partition of INSERT's target table.
(These cases are really rare for me)

In the two cases, the referenced table could be modified when INSERTing and CCI is necessary,
So, I think we should treat these cases as parallel restricted while doing safety check.

Attaching V1 patch that Implemented above feature and passed regression test.

Best regards,
houzj

Attachment Content-Type Size
v1_0002-WIP-safety-check.patch application/octet-stream 4.6 KB
v1_0001-WIP-avoid-cci.patch application/octet-stream 9.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiro Ikeda 2021-03-11 12:29:38 Re: About to add WAL write/fsync statistics to pg_stat_wal view
Previous Message Julien Rouhaud 2021-03-11 11:42:56 Re: first add newly loaded plugin to the list then invoke _PG_init