Re: Parallel Inserts in CREATE TABLE AS

From: Luc Vlaming <luc(at)swarm64(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: Parallel Inserts in CREATE TABLE AS
Date: 2020-10-16 06:03:16
Message-ID: 4e2b486c-a6d3-35f3-aae6-49ac944add09@swarm64.com

On 14.10.20 11:16, Bharath Rupireddy wrote:
> On Tue, Oct 6, 2020 at 10:58 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>
>>> Yes, we make a bunch of catalog changes related to the newly created
>>> table. Both the txn id and command id are assigned when the catalog
>>> changes are made. But right after the table is created in the leader,
>>> the command id is incremented (CommandCounterIncrement() is called
>>> from create_ctas_internal()) whereas the txn id remains the same. The
>>> new command id is marked as used via GetCurrentCommandId(true) in
>>> intorel_startup; then parallel mode is entered. The txn id and command
>>> id are serialized into the parallel DSM and are thus available to all
>>> parallel workers. This is discussed in [1].
>>>
>>> A few changes remain in the parallel worker code: set
>>> currentCommandIdUsed = true, perhaps via a common API
>>> SetCurrentCommandIdUsedForWorker() proposed in [1], and remove the
>>> extra command id sharing from the leader to the workers.
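>>>
>>> A rough sketch of that helper (hypothetical; the name comes from [1],
>>> and it would live in xact.c since currentCommandIdUsed is static
>>> there; not a final API):
>>>
>>> /* Mark the leader's command id as used from within a worker. */
>>> void
>>> SetCurrentCommandIdUsedForWorker(void)
>>> {
>>>     Assert(IsParallelWorker());
>>>     currentCommandIdUsed = true;
>>> }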
>>>
>>> I will add a few comments in the upcoming patch related to the above info.
>>>
>>
>> Yes, that would be good.
>>
>
> Added comments.
>
>>
>>>> But how does that work for SELECT INTO? Are you prohibiting
>>>> that? ...
>>>>
>>>
>>> In the SELECT INTO case a new table also gets created, and I'm not
>>> prohibiting parallel inserts there; I don't think we need to.
>>>
>>
>> So, in this case also, do we ensure that the table is created before
>> we launch the workers? If so, I think you can explain in the comments
>> how that is ensured.
>>
>
> For SELECT INTO, the table gets created by the leader in
> create_ctas_internal(), then ExecInitParallelPlan() gets called, which
> launches the workers, and then the leader (if asked to do so) and the
> workers insert the rows. So, we don't need to do any extra work to
> ensure the table gets created before the workers start inserting
> tuples.
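>
> Roughly, the order of operations in the leader looks like this
> (a simplified sketch based on the existing functions, not exact code):
>
> standard_ExecutorRun()
>     -> intorel_startup()             /* rStartup of the CTAS receiver */
>         -> create_ctas_internal()    /* target table created here */
>     -> ExecutePlan()
>         -> ExecGather()
>             -> ExecInitParallelPlan()  /* workers launched only now */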
>
>>
>> While skimming through the patch, a small thing I noticed:
>> + /*
>> + * The SELECT part of the CTAS is parallelizable, so we can make
>> + * each parallel worker insert the tuples resulting from its
>> + * execution into the target table.
>> + */
>> + if (!is_matview &&
>> + IsA(plan->planTree, Gather))
>> + ((DR_intorel *) dest)->is_parallel = true;
>> +
>>
>> I am not sure at this stage if this is the best way to make CTAS
>> parallel, but if so, then you can probably expand the comment a bit to
>> say why you consider only the Gather node (and only when it is the
>> top-most node) and why not another parallel node like GatherMerge.
>>
>
> If somebody expects the order of the tuples coming from a GatherMerge
> node in the SELECT part of CTAS or SELECT INTO to be preserved while
> inserting, then allowing parallelism would break that expectation: the
> order in which the tuples are inserted may vary. I'm not quite sure
> whether anyone uses ORDER BY in the SELECT part of CTAS or SELECT INTO
> in a real-world use case. Thoughts?
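>
> For now, the comment in the patch could be expanded along these lines
> (draft wording only):
>
> /*
>  * Parallelize inserts only if the SELECT part is parallelized, i.e.
>  * the plan's top node is a Gather. GatherMerge is deliberately
>  * excluded for now: it produces ordered output, and parallel
>  * insertion would not preserve that order in the target table.
>  */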
>
>>
>> Right, for now, I think you can simply remove that check from the code
>> instead of just commenting it. We will see if there is a better
>> check/Assert we can add there.
>>
>
> Done.
>
> I also worked on some of the open points I listed earlier in my mail.
>
>>
>> 3. Need to restrict parallel inserts if CTAS tries to create
>> temp/global tables, as the workers will not have access to those tables.
>>
>
> Done.
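>
> A sketch of the kind of check this implies (hypothetical placement,
> not necessarily how the patch does it; the field names come from
> IntoClause/RangeVar):
>
> /* Workers cannot access temporary tables, so disable parallel insert. */
> if (into->rel->relpersistence == RELPERSISTENCE_TEMP)
>     ((DR_intorel *) dest)->is_parallel = false;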
>
>>
>> Need to analyze whether to allow parallelism if CTAS is run via a
>> prepared statement or with the NO DATA option.
>>
>
> For prepared statements, parallelism is not picked, and so neither is
> parallel insertion.
> For the CTAS WITH NO DATA case, the SELECT part is not even planned,
> so parallelism will not be picked there either.
>
>>
>> 4. Need to stop unnecessary parallel shared state, such as the tuple
>> queues, from being created and shared with the workers.
>>
>
> Done.
>
> I'm listing the things that are still pending.
>
> 1. How to represent the parallel insert for CTAS in explain plans?
> EXPLAIN on a CTAS shows the plan for only the SELECT part. How about
> adding some textual info alongside the Gather node? I'm not quite sure
> on this point; any suggestions are welcome.
> 2. Addition of new test cases: testing with more scenarios and
> different data sets, sizes, tablespaces, and SELECT INTO, plus analysis
> of the 2 mismatches in the write_parallel.sql regression test.
>
> Attaching v2 patch, thoughts and comments are welcome.
>
> With Regards,
> Bharath Rupireddy.
> EnterpriseDB: http://www.enterprisedb.com
>

Hi,

Really looking forward to this ending up in Postgres, as I think it's a
very nice improvement.

Whilst reviewing your patch I was wondering: is there a reason you did
not introduce batch inserts in the DestReceiver for the CTAS? For me
this makes a huge difference in ingest speed, as otherwise the inserts
do not scale so well once lock contention starts to become a big
problem. If you like, I can make a patch to introduce this on top?
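
Just to illustrate the idea, a rough, untested sketch (the
buffered_slots and nbuffered fields are hypothetical additions to
DR_intorel; the batch size and flush policy are placeholders): the
receiveSlot callback would buffer tuples and flush them with a single
table_multi_insert() call instead of inserting one tuple at a time:

#define CTAS_BATCH_SIZE 1000    /* placeholder, needs tuning */

static bool
intorel_receive_batched(TupleTableSlot *slot, DestReceiver *self)
{
    DR_intorel *myState = (DR_intorel *) self;

    /* Buffer a copy of the tuple (slots preallocated at startup). */
    ExecCopySlot(myState->buffered_slots[myState->nbuffered++], slot);

    /* Flush the whole batch with one multi-insert call when full. */
    if (myState->nbuffered == CTAS_BATCH_SIZE)
    {
        table_multi_insert(myState->rel, myState->buffered_slots,
                           myState->nbuffered, myState->output_cid,
                           myState->ti_options, myState->bistate);
        myState->nbuffered = 0;
    }
    return true;
}

(Any remaining buffered tuples would be flushed in the rShutdown
callback.)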

Kind regards,
Luc
Swarm64
