Re: Parallel INSERT (INTO ... SELECT ...)

From: Greg Nancarrow <gregn4422(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "Tang, Haiying" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-02-26 06:35:14
Message-ID: CAJcOf-ewbpFGfLJy0dA24KiZTdPtnkmYtMsaoT8_UAjEQJoZww@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Feb 26, 2021 at 4:07 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>
> Hi Greg,
>
> Replying to an earlier email in the thread because I think I found a
> problem relevant to the topic that was brought up.
>
> On Wed, Feb 17, 2021 at 10:34 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > On Wed, Feb 17, 2021 at 10:44 AM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
> > > Is the use of "table_close(rel, NoLock)'' intentional? That will keep
> > > the lock (lockmode) until end-of-transaction.
> >
> > I think we always keep any locks on relations that are involved in a
> > plan until end-of-transaction. What if a partition is changed in an
> > unsafe manner between being considered safe for parallel insertion and
> > actually performing the parallel insert?
>
> I realized that there is a race condition that will allow a concurrent
> backend to invalidate a parallel plan (for insert into a partitioned
> table) at a point in time when it's too late for plancache.c to detect
> it. It has to do with how plancache.c locks the relations affected by
> a cached query and its cached plan. Specifically,
> AcquireExecutorLocks(), which locks the relations that need to be
> locked before the plan could be considered safe to execute, does not
> notice the partitions that would have been checked for parallel safety
> when the plan was made. Given that AcquireExecutorLocks() only loops
> over PlannedStmt.rtable and locks exactly the relations found there,
> partitions are not locked. This means that a concurrent backend can,
> for example, add an unsafe trigger to a partition before a parallel
> worker inserts into it only to fail when it does.
>
> Steps to reproduce (tried with v19 set of patches):
>
> drop table if exists rp, foo;
> create table rp (a int) partition by range (a);
> create table rp1 partition of rp for values from (minvalue) to (0);
> create table rp2 partition of rp for values from (0) to (maxvalue);
> create table foo (a) as select generate_series(1, 1000000);
> set plan_cache_mode to force_generic_plan;
> set enable_parallel_dml to on;
> deallocate all;
> prepare q as insert into rp select * from foo where a%2 = 0;
> explain analyze execute q;
>
> At this point, attach a debugger to this backend and set a breakpoint
> in AcquireExecutorLocks() and execute q again:
>
> -- will execute the cached plan
> explain analyze execute q;
>
> Breakpoint will be hit. Continue till return from the function and
> stop. Start another backend and execute this:
>
> -- will go through, because no lock still taken on the partition
> create or replace function make_table () returns trigger language
> plpgsql as $$ begin create table bar(); return null; end; $$ parallel
> unsafe;
> create trigger ai_rp2 after insert on rp2 for each row execute
> function make_table();
>
> Back to the debugger, hit continue to let the plan execute. You
> should see this error:
>
> ERROR: cannot start commands during a parallel operation
> CONTEXT: SQL statement "create table bar()"
> PL/pgSQL function make_table() line 1 at SQL statement parallel worker
>
> The attached patch fixes this, although I am starting to have second
> thoughts about how we're tracking partitions in this patch. Wondering
> if we should bite the bullet and add partitions into the main range
> table instead of tracking them separately in partitionOids, which
> might result in a cleaner patch overall.
>

Thanks Amit,

I was able to reproduce the problem using your instructions (though I
found I had to run that explain an extra time, in order to hit the
breakpoint).
Also, I can confirm that the problem doesn't occur after application
of your patch.
I'll leave it to your better judgement as to what to do next - if you
feel the current tracking method is not sufficient

Regards,
Greg Nancarrow
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael J. Baars 2021-02-26 06:41:15 Re: Postgresql network transmission overhead
Previous Message tsunakawa.takay@fujitsu.com 2021-02-26 06:34:09 RE: Global snapshots