RE: Parallel Inserts in CREATE TABLE AS

From: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
To: 'Bharath Rupireddy' <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Luc Vlaming <luc(at)swarm64(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: RE: Parallel Inserts in CREATE TABLE AS
Date: 2021-05-27 02:18:16
Message-ID: TYAPR01MB29900565139996B7EEB04B0EFE239@TYAPR01MB2990.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you for the detailed analysis, I'll look into it too. (The times have changed...)

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
> Well, one might think to add more blocks at a time, say
> Min(1024, lockWaiters * 128/256/512) than currently extraBlocks =
> Min(512, lockWaiters * 20);. This will work (i.e. we don't see any
> regression with parallel inserts in CTAS patches), but it can't be a
> practical solution. Because the total pages for the relation will be
> more with many pages having more free space. Furthermore, the future
> sequential scans on that relation might take a lot of time.

> Otherwise, the similar speed up can be observed when the BAS_BULKWRITE
> is increased a bit from the current 16MB to some other reasonable
> value. I earlier tried these experiments.
>
> Otherwise, as I said in [1], we can also increase the number of extra
> blocks added at a time, say Min(1024, lockWaiters * 128/256/512) than
> currently extraBlocks = Min(512, lockWaiters * 20);. This will also
> give some speedup and we don't see any regression with parallel
> inserts in CTAS patches.
>
> But, I'm not so sure that the hackers will agree any of the above as a
> practical solution to the "relation extension" problem.

I think I understand your concern about resource consumption and impact on other concurrently running jobs (OLTP, data analysis.)

OTOH, what's the situation like when the user wants to run CTAS, and further, wants to speed it up by using parallelism? isn't it okay to let the (parallel) CTAS use as much as it wants? At least, I think we can provide another mode for it, like Oracle provides conditional path mode and direct path mode for INSERT and data loading.

What do we want to do to maximize parallel CTAS speedup if we were a bit unshackled from the current constraints (alignment with existing code, impact on other concurrent workloads)?

* Use as many shared buffers as possible to decrease WAL flush.
Otherwise, INSERT SELECT may be faster?

* Minimize relation extension (= increase the block count per extension)
posix_fallocate() would help too.

* Allocate added pages among parallel workers, and each worker fills pages to their full capacity.
The worker that extended the relation stores the page numbers of added pages in shared memory for parallel execution. Each worker gets a page from there after waiting for the relation extension lock, instead of using FSM.
The last pages that the workers used will be filled halfway, but the amount of unused space should be low compared to the total table size.

Regards
Takayuki Tsunakawa

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2021-05-27 02:24:12 Re: Move pg_attribute.attcompression to earlier in struct for reduced size?
Previous Message Michael Paquier 2021-05-27 02:17:03 Re: Speed up pg_checksums in cases where checksum already set