|From:||amul sul <sulamul(at)gmail(dot)com>|
|To:||Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>|
|Cc:||Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>|
|Subject:||Re: [HACKERS] Parallel Append implementation|
On Tue, Nov 21, 2017 at 2:22 PM, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
> On 21 November 2017 at 12:44, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com> wrote:
>> On Mon, Nov 13, 2017 at 12:54 PM, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
>>> Thanks a lot Robert for the patch. I will have a look. Quickly tried
>>> to test some aggregate queries with a partitioned pgbench_accounts
>>> table, and it is crashing. Will get back with the fix, and any other
>>> review comments.
>>> -Amit Khandekar
>> I was trying to get the performance of this patch at commit id -
>> 11e264517dff7a911d9e6494de86049cab42cde3 and TPC-H scale factor 20
>> with the following parameter settings,
>> work_mem = 1 GB
>> shared_buffers = 10GB
>> effective_cache_size = 10GB
>> max_parallel_workers_per_gather = 4
>> enable_partitionwise_join = on
>> and the details of the partitioning scheme are as follows,
>> tables partitioned = lineitem on l_orderkey and orders on o_orderkey
>> number of partitions in each table = 10
>> As per the explain outputs, PA was used in the following queries: 1, 3, 4,
>> 5, 6, 7, 8, 10, 12, 14, 15, 18, and 21.
>> Unfortunately, at the time of executing any of these queries, it is
>> crashing with the following information in core dump of each of the
>> Program terminated with signal 11, Segmentation fault.
>> #0 0x0000000010600984 in pg_atomic_read_u32_impl (ptr=0x3ffffec29294)
>> at ../../../../src/include/port/atomics/generic.h:48
>> 48 return ptr->value;
>> In case this is a different issue from the one you pointed out upthread,
>> you may want to have a look at this as well.
>> Please let me know if you need any more information in this regard.
> Right, for me the crash had occurred with a similar stack, although
> the real crash happened in one of the workers. Attached is the script
> pgbench_partitioned.sql to create a schema with which I had reproduced
> the crash.
> The query that crashed :
> select sum(aid), avg(aid) from pgbench_accounts;
> Set max_parallel_workers_per_gather to at least 5.
> Also attached is v19 patch rebased.
I spent a little time debugging this crash. It happens in ExecAppend()
because the subnode in the node->appendplans array is referenced with an
incorrect (out-of-bounds) array index in the following code:
    /*
     * figure out which subplan we are currently processing
     */
    subnode = node->appendplans[node->as_whichplan];
This incorrect value of node->as_whichplan gets assigned in the
The following change on top of the v19 patch fixes it for me:
@@ -489,11 +489,9 @@ choose_next_subplan_for_worker(AppendState *node)
     /* Pick the plan we found, and advance pa_next_plan one more time. */
-    node->as_whichplan = pstate->pa_next_plan;
+    node->as_whichplan = pstate->pa_next_plan++;
     if (pstate->pa_next_plan == node->as_nplans)
         pstate->pa_next_plan = append->first_partial_plan;
     /* If non-partial, immediately mark as finished. */
     if (node->as_whichplan < append->first_partial_plan)
The attached patch makes the same change on top of Amit's
ParallelAppend_v19_rebased.patch.
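
To illustrate the intent of the change above, here is a minimal standalone sketch of the "pick, then advance and wrap" step. It is not the patch code: the struct and function names (ToyAppendState, toy_pick_subplan) are hypothetical, it is single-threaded, it ignores locking and the finished-subplan bookkeeping, and it assumes there is at least one partial subplan. It only shows that with the post-increment followed by the wraparound check, the index handed back always stays within the subplan array.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for the shared parallel-append state. */
typedef struct ToyAppendState
{
    int nplans;             /* number of subplans (as_nplans in the patch) */
    int first_partial_plan; /* index of the first partial subplan */
    int next_plan;          /* shared cursor (pa_next_plan in the patch) */
} ToyAppendState;

/*
 * Return the subplan index to run now, and leave next_plan pointing at a
 * valid index for the next caller.
 */
static int
toy_pick_subplan(ToyAppendState *st)
{
    int whichplan;

    /* Assumptions of this sketch, not guarantees made by the patch. */
    assert(st->nplans > 0);
    assert(st->first_partial_plan >= 0 && st->first_partial_plan < st->nplans);
    assert(st->next_plan >= 0 && st->next_plan < st->nplans);

    /* Take the current value, then advance the cursor (post-increment). */
    whichplan = st->next_plan++;

    /* Wrap around as soon as the cursor steps past the last subplan. */
    if (st->next_plan == st->nplans)
        st->next_plan = st->first_partial_plan;

    return whichplan;
}
```

With 5 subplans of which the first 2 are non-partial, repeated calls visit each non-partial subplan once and then cycle over the partial ones, so the cursor never reaches nplans.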