Re: Parallel INSERT (INTO ... SELECT ...)

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Greg Nancarrow <gregn4422(at)gmail(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "Tang, Haiying" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-03-04 09:05:10
Message-ID: CAFiTN-tpvraspVM_ahf-ifvCT2J2yVs6qSbXC59Ebsa_mK+jiA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 4, 2021 at 7:16 AM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
>
> On Thu, Mar 4, 2021 at 1:07 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Wed, Mar 3, 2021 at 5:50 PM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
> > >
> > I agree that assert is only for debug build, but once we add and
> > assert that means we are sure that it should only be called for insert
> > and if it is called for anything else then it is a programming error
> > from the caller's side. So after the assert, adding if check for the
> > same condition doesn't look like a good idea. That means we think
> > that the code can hit assert in the debug mode so we need an extra
> > protection in the release mode.
> >
>
> The if-check isn't there for "extra protection".
> It's to help with future changes; inside that if-block is code only
> applicable to INSERT (and to UPDATE - sorry, before I said DELETE), as
> the code-comment indicates, whereas the rest of the function is
> generic to all command types. I don't see any problem with having this
> if-block here, to help in this way, when other command types are
> added.

If that is the case then this check should also be added along with
that future patches, I mean when we will allow UPDATE then it makes
sense of that check and that time will have to get rid of that assert
as well. I mean complete function will be in sync. But now this
check looks a bit odd. I think that is my opinion but otherwise, I
don't have any strong objection to that check.

> > >
> > > > 2.
> > But the cost_modifytable is setting the number of rows to 0 in
> > ModifyTablePath whereas the cost_gather will multiply the rows from
> > the GatherPath. I can not see the rows from GatherPath is ever set to
> > 0.
> >
>
> OK, I see the problem now.
> It works the way I described, but currently there's a problem with
> where it's getting the rows for the GatherPath, so there's a
> disconnect.
> When generating the GatherPaths, it's currently always taking the
> rel's (possibly estimated) row-count, rather than using the rows from
> the cheapest_partial_path (the subpath: ModifyTablePath). See
> generate_gather_paths().
> So when generate_useful_gather_paths() is called from the planner, for
> the added partial paths for Parallel INSERT, it should be passing
> "true" for the "override_rows" parameter, not "false".
>
> So I think that in the 0004 patch, the if-block:
>
> + if (parallel_modify_partial_path_added)
> + {
> + final_rel->rows = current_rel->rows; /* ??? why
> hasn't this been
> +
> * set above somewhere ???? */
> + generate_useful_gather_paths(root, final_rel, false);
> + }
> +
>
> can be reduced to:
>
> + if (parallel_modify_partial_path_added)
> + generate_useful_gather_paths(root, final_rel, true);
> +

Okay. I will check this part after you provide an updated version. Thanks.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2021-03-04 09:06:19 Re: Parallel INSERT (INTO ... SELECT ...)
Previous Message Heikki Linnakangas 2021-03-04 08:59:41 Re: Removing support for COPY FROM STDIN in protocol version 2