From: | "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com> |
---|---|
To: | 'Greg Nancarrow' <gregn4422(at)gmail(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | RE: Parallel INSERT (INTO ... SELECT ...) |
Date: | 2020-12-10 02:23:54 |
Message-ID: | OSBPR01MB298274BCB8959932191EA9A6FECB0@OSBPR01MB2982.jpnprd01.prod.outlook.com |
Lists: | pgsql-hackers |
From: Greg Nancarrow <gregn4422(at)gmail(dot)com>
> Firstly, in order to perform parallel-safety checks in the case of partitions, the
> patch currently recursively locks/unlocks
> (AccessShareLock) each partition during such checks (as each partition may
> itself be a partitioned table). Is there a better way of performing the
> parallel-safety checks and reducing the locking requirements?
First of all, as you demonstrated with the planning and execution times of parallel insert, I think the increased planning time is negligible when parallel insert is intentionally used for loading a large amount of data. However, it is a problem if the overhead is imposed on OLTP transactions. Does the overhead occur with the default values of max_parallel_workers_per_gather = 2 and max_parallel_workers = 8?
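As a concrete way to check this (a sketch only; the table names are made up for illustration), the planning overhead could be compared with the default parallel settings and with parallelism disabled:

```sql
-- Hypothetical example: compare planning time for the same INSERT
-- under the default settings and under a forced serial plan.
SHOW max_parallel_workers_per_gather;  -- default: 2
SHOW max_parallel_workers;             -- default: 8

-- EXPLAIN (SUMMARY) reports "Planning Time" without executing the query.
EXPLAIN (SUMMARY) INSERT INTO target_table SELECT * FROM source_table;

SET max_parallel_workers_per_gather = 0;  -- disable parallel plans
EXPLAIN (SUMMARY) INSERT INTO target_table SELECT * FROM source_table;
```

If the "Planning Time" values differ noticeably for a small OLTP-style insert, the parallel-safety checks are being paid for even when they cannot help.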
To avoid this heavy checking during planning, I'm wondering if we can have an attribute in pg_class, something like relhasindex and relhastriggers. The concerning point is that we would have to maintain the accuracy of the value when ancillary objects around the table/partition are created or dropped.
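For comparison, the existing flags this idea is modeled on can be inspected in pg_class; the new column name below is purely hypothetical:

```sql
-- Existing per-relation flags that DDL already keeps up to date:
SELECT relname, relhasindex, relhastriggers
FROM pg_class
WHERE relname = 'some_partitioned_table';  -- hypothetical table name

-- The idea would add an analogous flag, e.g. a hypothetical
-- "relparallelinsertsafe" column, which DDL on the table and its
-- ancillary objects (triggers, indexes, constraints, defaults, ...)
-- would have to keep accurate, just as relhastriggers is maintained
-- when triggers are created or dropped.
```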
> Secondly, I found that when running "make check-world", the
> "partition-concurrent-attach" test fails, because it is expecting a partition
> constraint to be violated on insert, while an "alter table attach partition ..." is
> concurrently being executed in another transaction. Because of the partition
> locking done by the patch's parallel-safety checking code, the insert blocks on
> the exclusive lock held by the "alter table" in the other transaction until the
> transaction ends, so the insert ends up successfully completing (and thus fails
> the test) when the other transaction ends. To overcome this test failure, the
> patch code was updated to instead perform a conditional lock on the partition,
> and on failure (i.e. because of an exclusive lock held somewhere else), just
> assume it's parallel-unsafe because the parallel-safety can't be determined
> without blocking on the lock. This is not ideal, but I'm not sure of what other
> approach could be used and I am somewhat reluctant to change that test. If
> anybody is familiar with the "partition-concurrent-attach" test, any ideas or
> insights would be appreciated.
That test looks sane. I think what we should do is disable parallel operation during that test. It looks like some other existing test cases disable parallel query by setting max_parallel_workers_per_gather to 0. It's not unusual for some tests to fail under particular configurations; for example, autovacuum is disabled in many places in the regression tests.
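Concretely (a sketch, assuming the test can set GUCs in its setup step the way other regression tests do), the partition-concurrent-attach test could pin a serial plan like this:

```sql
-- Hypothetical addition to the test's setup, mirroring how other
-- regression tests disable parallel query so the parallel-safety
-- lock acquisition on partitions never happens:
SET max_parallel_workers_per_gather = 0;
```

With parallelism disabled, the insert would take the serial path, hit the partition constraint as the test expects, and the blocking behavior introduced by the parallel-safety checks would not apply.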
Rather, I don't think we should introduce the trick of using ConditionalLockAcquire(). Otherwise, the insert could silently be executed in a serial fashion without the user knowing it -- "What? The INSERT suddenly became several times slower today and didn't finish within the planned maintenance window. What's wrong?"
Regards
Takayuki Tsunakawa