RE: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Greg Nancarrow <gregn4422(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: RE: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)
Date: 2021-08-03 07:40:22
Message-ID: OS0PR01MB5716DB1E3F723F86314D080094F09@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Based on the discussion here, I implemented the auto-safety-check feature.
Since most of the technical discussion happened here, I attached the patches in
this thread.

The patches allow users to specify a parallel-safety option for both
partitioned and non-partitioned relations. If the user has specified a
parallel-safety option, we use it instead of computing the value ourselves.
For non-partitioned relations, if the user didn't specify one, it is computed
automatically. But for a partitioned table, if the user didn't specify the
parallel DML safety, it is treated as unsafe.
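
As a rough illustration only (this is not code from the patches), the rule
above can be modeled in Python. The names Safety and
resolve_parallel_dml_safety, and the compute callback standing in for the
automatic check, are hypothetical.

```python
from enum import Enum
from typing import Callable, Optional


class Safety(Enum):
    SAFE = "safe"
    RESTRICTED = "restricted"
    UNSAFE = "unsafe"


def resolve_parallel_dml_safety(
    declared: Optional[Safety],
    is_partitioned: bool,
    compute: Callable[[], Safety],
) -> Safety:
    """Model of the decision rule; `declared` is None when nothing was specified."""
    if declared is not None:
        # A user-specified option always wins over the automatic check.
        return declared
    if is_partitioned:
        # Partitioned tables are not checked automatically: treated as unsafe.
        return Safety.UNSAFE
    # Non-partitioned relation without an option: compute it automatically.
    return compute()
```

Note that in this model the compute callback is only invoked for a
non-partitioned relation with no declared option.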

For non-partitioned relations, after computing the parallel-safety of the
relation during planning, we save it in the relation cache entry. Whenever any
function's parallel-safety is changed, we invalidate the cached
parallel-safety for all relations in the relcache of that database.
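
To make the caching behavior concrete, here is a minimal Python model (again,
not code from the patches). The class ParallelSafetyCache and its method names
are hypothetical; the real implementation lives in the relcache and its
invalidation machinery.

```python
from typing import Callable, Dict


class ParallelSafetyCache:
    """Toy model: per-relation cached safety, dropped wholesale on any change."""

    def __init__(self, compute: Callable[[int], str]) -> None:
        self._compute = compute          # stands in for the planner-time check
        self._cached: Dict[int, str] = {}

    def get(self, relid: int) -> str:
        # Compute on first use, then serve the value saved in the cache entry.
        if relid not in self._cached:
            self._cached[relid] = self._compute(relid)
        return self._cached[relid]

    def on_function_safety_changed(self) -> None:
        # Any function's parallel-safety changing invalidates every relation's
        # cached value, since we do not track which relations use the function.
        self._cached.clear()
```

The coarse invalidation keeps the model simple: after a change, each relation
is recomputed on its next use.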

To make it possible for users to alter the safety back to a not-specified
value and get the automatic safety check, the patches add a new default option
for parallel DML safety (temporarily named 'DEFAULT', in addition to
safe/unsafe/restricted).

To help users provide a parallel-safety option, the patches provide a utility
function "pg_get_table_parallel_dml_safety(regclass)" that returns records of
(objid, classid, parallel_safety) for all parallel unsafe/restricted
table-related objects from which the table's parallel DML safety is determined.
This allows the user to identify unsafe objects and, if required, change the
parallel safety of the relevant functions and then use the parallel-safety
option for the table.
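
The shape of that result can be sketched in Python (a model, not the patch's
implementation). The record fields follow the description above; the helper
names, the use of catalog names as classid strings, and the 's'/'r'/'u' codes
with "unsafe > restricted > safe" ordering are assumptions borrowed from how
function parallel-safety is normally ranked.

```python
from typing import Iterable, List, NamedTuple


class ObjectSafety(NamedTuple):
    objid: int
    classid: str          # simplified: catalog name instead of an OID
    parallel_safety: str  # 's' = safe, 'r' = restricted, 'u' = unsafe


# Assumed ordering: a higher rank is the more severe hazard.
_RANK = {"s": 0, "r": 1, "u": 2}


def non_safe_objects(objects: Iterable[ObjectSafety]) -> List[ObjectSafety]:
    """Return only the restricted/unsafe objects, as the utility function does."""
    return [o for o in objects if o.parallel_safety != "s"]


def overall_safety(objects: Iterable[ObjectSafety]) -> str:
    """The table's safety is the worst safety among its related objects."""
    return max((o.parallel_safety for o in objects), key=_RANK.__getitem__, default="s")
```

A user would look at the returned records, fix or relabel the listed
functions where that is correct, and then declare the table's parallel DML
safety.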

Best regards,
houzj

Attachment                                       Content-Type              Size
0006-hack-the-rewriter-bug.patch                 application/octet-stream  1.4 KB
v15-0001-CREATE-ALTER-TABLE-PARALLEL-DML.patch   application/octet-stream  44.1 KB
v15-0002-parallel-SELECT-for-INSERT.patch        application/octet-stream  9.9 KB
v15-0003-get-parallel-safety-functions.patch     application/octet-stream  31.6 KB
v15-0004-cache-parallel-dml-safety.patch         application/octet-stream  11.9 KB
v15-0005-regression-test-and-doc-updates.patch   application/octet-stream  110.9 KB
