Re: Parallel INSERT (INTO ... SELECT ...)

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Greg Nancarrow <gregn4422(at)gmail(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "Tang, Haiying" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-03-03 14:07:21
Message-ID: CAFiTN-sbKVNo+i+obw2KuANt5SKr=g8qTBT_4KaLcgj99shPeg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Mar 3, 2021 at 5:50 PM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
>
> Asserts are normally only enabled in a debug-build, so for a
> release-build that Assert has no effect.
> The Assert is being used as a sanity-check that the function is only
> currently getting called for INSERT, because that's all it currently
> supports.

I agree that asserts are only enabled in debug builds, but once we add
an assert, it means we are sure the function should only be called for
INSERT, and that calling it for anything else is a programming error
on the caller's side. So adding an if check for the same condition
right after the assert doesn't look like a good idea. It implies we
think the code can hit the assert in debug mode, and so we need extra
protection in release mode.
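
To illustrate the pattern under discussion, here is a minimal standalone
sketch. It is not the patch code: the enum and function names are
hypothetical, and the standard assert() stands in for PostgreSQL's
Assert() macro.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the command type; not the PostgreSQL definition. */
typedef enum
{
    SKETCH_CMD_SELECT,
    SKETCH_CMD_INSERT
} SketchCmdType;

/*
 * Pattern in question: an assert followed by a runtime check of the
 * very same condition.
 */
bool
sketch_check_redundant(SketchCmdType cmd)
{
    /* Declares the contract: callers must only pass INSERT. */
    assert(cmd == SKETCH_CMD_INSERT);

    /*
     * Never true when asserts are enabled (the assert fires first).
     * In a release build it handles a case the assert says cannot happen.
     */
    if (cmd != SKETCH_CMD_INSERT)
        return false;

    return true;
}

/* Alternative: the assert alone documents and enforces the contract. */
bool
sketch_check_assert_only(SketchCmdType cmd)
{
    assert(cmd == SKETCH_CMD_INSERT);
    return true;
}
```

If a non-INSERT call is a legitimate possibility, a plain runtime check
without the assert would express that; keeping both sends a mixed
message about the contract.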

>
> > 2.
> > In patch 0004, We are still charging the parallel_tuple_cost for each
> > tuple, are we planning to do something about this? I mean after this
> > patch tuple will not be transferred through the tuple queue, so we
> > should not add that cost.
> >
>
> I believe that for Parallel INSERT, cost_modifytable() will set
> path->path.rows to 0 (unless there is a RETURNING list), so, for
> example, in cost_gather(), it will not add to the run_cost as
> "run_cost += parallel_tuple_cost * path->path.rows;"
>

But cost_modifytable() sets the number of rows to 0 in the
ModifyTablePath, whereas cost_gather() multiplies by the rows from
the GatherPath. I cannot see where the rows of the GatherPath are ever
set to 0.
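
A minimal standalone sketch of the concern follows. The structs and
function names are hypothetical simplifications, not the planner's
actual definitions, and the numbers are arbitrary; the only point is
that the per-tuple charge reads the Gather path's own row count, not
the row count of the path beneath it.

```c
/* Hypothetical, simplified path node; not PostgreSQL's Path struct. */
typedef struct SketchPath
{
    double      rows;           /* estimated rows emitted by this node */
    double      total_cost;
} SketchPath;

/* Hypothetical, simplified Gather path wrapping a subpath. */
typedef struct SketchGatherPath
{
    SketchPath  path;           /* the Gather node's own estimates */
    SketchPath *subpath;        /* e.g. the ModifyTable path below it */
} SketchGatherPath;

/*
 * Models only the quoted line
 *   run_cost += parallel_tuple_cost * path->path.rows;
 * where path is the Gather path.
 */
double
sketch_gather_tuple_charge(const SketchGatherPath *gather,
                           double parallel_tuple_cost)
{
    return parallel_tuple_cost * gather->path.rows;
}

/*
 * Scenario: the subpath's rows were set to 0, but the Gather path's
 * rows were not, so the per-tuple charge is still applied.
 */
double
sketch_demo(void)
{
    SketchPath  modify = {.rows = 0.0, .total_cost = 0.0};
    SketchGatherPath gather = {
        .path = {.rows = 1000.0, .total_cost = 0.0},
        .subpath = &modify
    };

    /* 0.5 is an arbitrary illustrative value for the cost parameter. */
    return sketch_gather_tuple_charge(&gather, 0.5);
}
```

In this sketch the charge is nonzero even though the subpath reports 0
rows; it would only drop to 0 if the Gather path's own rows were set to
0 as well.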

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
