Re: Parallel INSERT (INTO ... SELECT ...)

From: Greg Nancarrow <gregn4422(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "Tang, Haiying" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-03-01 03:38:21
Message-ID: CAJcOf-detRuxh43MDgfqNFseK1KLRYtG2mi4g95=YUKXcWfzjA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 26, 2021 at 5:50 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>
> On Fri, Feb 26, 2021 at 3:35 PM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
> > On Fri, Feb 26, 2021 at 4:07 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > > The attached patch fixes this, although I am starting to have second
> > > thoughts about how we're tracking partitions in this patch. Wondering
> > > if we should bite the bullet and add partitions into the main range
> > > table instead of tracking them separately in partitionOids, which
> > > might result in a cleaner patch overall.
> >
> > Thanks Amit,
> >
> > I was able to reproduce the problem using your instructions (though I
> > found I had to run that explain an extra time, in order to hit the
> > breakpoint).
> > Also, I can confirm that the problem doesn't occur after application
> > of your patch.
> > I'll leave it to your better judgement as to what to do next - if you
> > feel the current tracking method is not sufficient
>
> Just to be clear, I think the tracking method added by the patch is
> sufficient AFAICS for the problems we were able to discover. The
> concern I was trying to express is that we seem to be duct-taping
> holes in our earlier chosen design to track partitions separately from
> the range table. If we had decided to add partitions to the range
> table as "extra" target relations from the get-go, both the issues I
> mentioned with cached plans -- partitions not being counted as a
> dependency and partitions not being locked before execution -- would
> have been prevented. I haven't fully grasped how invasive that design
> would be, but it sure sounds like it would be a bit more robust.
>

I'm posting an updated set of patches (v20) that includes Amit Langote's
fix to the partition tracking scheme.
The alternative of adding partitions to the range table needs further
investigation.

Regards,
Greg Nancarrow
Fujitsu Australia

Attachment                                                                  Content-Type              Size
v20-0001-Enable-parallel-SELECT-for-INSERT-INTO-.-SELECT.patch              application/octet-stream  33.0 KB
v20-0002-Parallel-SELECT-for-INSERT-INTO-.-SELECT-tests-and-doc.patch       application/octet-stream  69.8 KB
v20-0003-Add-new-parallel-dml-GUC-and-table-options.patch                   application/octet-stream  19.4 KB
v20-0004-Enable-parallel-INSERT-and-or-SELECT-for-INSERT-INTO.patch         application/octet-stream  44.4 KB
v20-0005-Parallel-INSERT-and-or-SELECT-for-INSERT-INTO-tests-and-doc.patch  application/octet-stream  22.3 KB
