RE: Parallel INSERT SELECT take 2

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Greg Nancarrow <gregn4422(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: RE: Parallel INSERT SELECT take 2
Date: 2021-05-24 05:15:46
Message-ID: OS0PR01MB57167789547D8F678ED5A20594269@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

From: Greg Nancarrow <gregn4422(at)gmail(dot)com>
Sent: Wednesday, May 19, 2021 7:55 PM
>
> On Fri, May 14, 2021 at 6:24 PM houzj(dot)fnst(at)fujitsu(dot)com
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > Thanks for the comments, I have posted new version patches with this
> > change.
> >
> > > How about reorganisation of the patches like the following?
> > > 0001: CREATE ALTER TABLE PARALLEL DML
> > > 0002: parallel-SELECT-for-INSERT (planner changes,
> > > max_parallel_hazard() update, XID changes)
> > > 0003: pg_get_parallel_safety()
> > > 0004: regression test updates
> >
> > Thanks, it looks good and I reorganized the latest patchset in this way.
> >
> > Attaching new version patches with the following change.
> >
> > 0003
> > Change functions arg type to regclass.
> >
> > 0004
> > remove updates for "serial_schedule".
> >
>
> I've got some comments for the V4 set of patches:
>
> (0001)
>
> (i) Patch comment needs a little updating (suggested change is below):
>
> Enable users to declare a table's parallel data-modification safety
> (SAFE/RESTRICTED/UNSAFE).
>
> Add a table property that represents parallel safety of a table for
> DML statement execution.
> It may be specified as follows:
>
> CREATE TABLE table_name PARALLEL DML { UNSAFE | RESTRICTED | SAFE };
> ALTER TABLE table_name PARALLEL DML { UNSAFE | RESTRICTED | SAFE };
>
> This property is recorded in pg_class's relparallel column as 'u',
> 'r', or 's', just like pg_proc's proparallel.
> The default is UNSAFE.
>
> The planner assumes that all of the table, its descendant partitions,
> and their ancillary objects have,
> at worst, the specified parallel safety. The user is responsible for
> its correctness.
>
> ---
>
> NOTE: The following sentence was removed from the original V4 0001
> patch comment (since this version of the patch is not doing runtime
> parallel-safety checks on functions):
>
> If the parallel processes
> find an object that is less safer than the assumed parallel safety during
> statement execution, it throws an ERROR and abort the statement execution.
>
>
> (ii) Update message to say "a foreign ...":
>
> BEFORE:
> + errmsg("cannot support parallel data modification on foreign or
> temporary table")));
>
> AFTER:
> + errmsg("cannot support parallel data modification on a foreign or
> temporary table")));
>
>
> (iii) strVal() macro already casts to "Value *", so the cast can be
> removed from the following:
>
> + char *parallel = strVal((Value *) def);
>
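
A minimal sketch of why the cast in (iii) is redundant. The types below are simplified stand-ins rather than the real PostgreSQL headers, and get_parallel_option() and make_string_value() are hypothetical helpers used only for illustration. The point is that a strVal()-style macro already applies the "Value *" cast to its argument, so the caller can pass a generic pointer directly.

```c
#include <stddef.h>

/* Simplified stand-ins for PostgreSQL node types (assumption, not the real headers). */
typedef enum NodeTag
{
    T_Invalid = 0,
    T_String
} NodeTag;

typedef struct Value
{
    NodeTag     type;
    union
    {
        int     ival;
        char   *str;
    }           val;
} Value;

/* Same shape as the macro under discussion: the cast lives inside the macro. */
#define strVal(v) (((Value *) (v))->val.str)

/* Hypothetical helper: reads the option string from a generic node pointer. */
static char *
get_parallel_option(void *def)
{
    /* No explicit (Value *) cast is needed here; strVal() applies it. */
    return strVal(def);
}

/* Hypothetical helper: wraps a C string in a String value node. */
static Value
make_string_value(char *s)
{
    Value       v;

    v.type = T_String;
    v.val.str = s;
    return v;
}
```

With these definitions, strVal(def) and strVal((Value *) def) yield the same pointer, so dropping the explicit cast does not change behaviour.
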
>
> (0003)
>
> (i) Suggested updates to the patch comment:
>
> Provide a utility function "pg_get_parallel_safety(regclass)" that
> returns records of
> (objid, classid, parallel_safety) for all parallel unsafe/restricted
> table-related objects
> from which the table's parallel DML safety is determined. The user can
> use this information
> during development in order to accurately declare a table's parallel
> DML safety, or to
> identify any problematic objects if a parallel DML fails or behaves
> unexpectedly.
>
> When the use of an index-related parallel unsafe/restricted function
> is detected, both the
> function oid and the index oid are returned.
>
> Provide a utility function "pg_get_max_parallel_hazard(regclass)" that
> returns the worst
> parallel DML safety hazard that can be found in the given relation.
> Users can use this
> function to do a quick check without caring about specific
> parallel-related objects.
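
To illustrate the "at worst" rule described above, here is a minimal sketch in C. It is not the patch's implementation. The letters match the 's'/'r'/'u' values mentioned for relparallel and proparallel; the function names are hypothetical, and treating an unrecognised letter as unsafe is my assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Safety letters as recorded in relparallel / proparallel. */
#define PARALLEL_SAFE       's'
#define PARALLEL_RESTRICTED 'r'
#define PARALLEL_UNSAFE     'u'

/* Rank a safety letter: 0 = safe, 1 = restricted, 2 = unsafe.
 * Assumption: any unrecognised letter is treated as unsafe. */
static int
hazard_rank(char hazard)
{
    switch (hazard)
    {
        case PARALLEL_SAFE:
            return 0;
        case PARALLEL_RESTRICTED:
            return 1;
        default:
            return 2;
    }
}

/* Hypothetical helper: return the worst safety letter among n objects.
 * An empty set is reported as safe. */
static char
worst_parallel_hazard(const char *hazards, size_t n)
{
    static const char letters[] = {PARALLEL_SAFE, PARALLEL_RESTRICTED, PARALLEL_UNSAFE};
    int         worst = 0;

    for (size_t i = 0; i < n; i++)
    {
        int         rank = hazard_rank(hazards[i]);

        if (rank > worst)
            worst = rank;
    }
    return letters[worst];
}

/* Hypothetical helper: a declared table safety is correct only if no
 * related object is worse than the declaration. */
static bool
declaration_holds(char declared, const char *hazards, size_t n)
{
    return hazard_rank(worst_parallel_hazard(hazards, n)) <= hazard_rank(declared);
}
```

For example, a table whose related objects are safe, restricted and safe has a worst hazard of 'r', so declaring it RESTRICTED or UNSAFE holds, while declaring it SAFE does not.
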

Thanks for the comments; your descriptions look good.
Attaching the v5 patchset with all of these changes.

Best regards,
houzj

Attachment                                          Content-Type              Size
v5-POC-0002-parallel-SELECT-for-INSERT.patch        application/octet-stream  9.9 KB
v5-POC-0003-get-parallel-safety-functions.patch     application/octet-stream  28.1 KB
v5-POC-0004-regression-test-updates.patch           application/octet-stream  32.6 KB
v5-POC-0001-CREATE-ALTER-TABLE-PARALLEL-DML.patch   application/octet-stream  37.4 KB
