Re: Proposal : Parallel Merge Join

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal : Parallel Merge Join
Date: 2016-12-21 04:35:14
Lists: pgsql-hackers

On Tue, Dec 13, 2016 at 8:34 PM, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> I have created two patches as per the suggestion.
> 1. mergejoin_refactoring_v2.patch --> Move functionality of
> considering various merge join path into new function.
> 2. parallel_mergejoin_v2.patch -> This adds the functionality of
> supporting partial mergejoin paths. This will apply on top of
> mergejoin_refactoring_v2.patch.

We have done further performance analysis with the TPC-H benchmark at
a higher scale factor. I tested the parallel merge join patch together
with the parallel index scan patch [1].

I observed that TPC-H query 3 (Q3) scales linearly with the patches
applied ('explain analyze' results are attached).

Test Setup:
TPC-H, scale factor 300
work_mem = 1GB
shared_buffers = 1GB
max_parallel_workers_per_gather = 4
Warm cache ensured; results are the median of 3 runs (readings are
quite stable).

On Head: 2702568.099 ms
With Patch: 547363.164 ms
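
To put the two timings above in perspective, here is a small sketch
that computes the head-to-patch ratio and the reduction in run time.
Only the two millisecond values come from the results above; the
variable names are illustrative, and the script makes no assumption
about which plan head chose.

```python
# Q3 timings copied from the results above (median of 3 runs, in ms).
head_ms = 2702568.099
patch_ms = 547363.164

# Ratio of the unpatched run time to the patched run time.
ratio = head_ms / patch_ms

# Fraction of the run time removed by the patches.
reduction = 1.0 - patch_ms / head_ms

# Convert to minutes for readability.
head_min = head_ms / 1000.0 / 60.0
patch_min = patch_ms / 1000.0 / 60.0

print(f"head:  {head_min:.1f} min")
print(f"patch: {patch_min:.1f} min")
print(f"head/patch ratio: {ratio:.2f}x")
print(f"run time reduction: {reduction:.1%}")
```

This works out to roughly 45 minutes on head versus roughly 9 minutes
with the patches, a ratio a little under 5x.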

Other Experiments:

* I also verified the readings on head without setting
random_page_cost = seq_page_cost; there was no change in plan or
execution time.

* I tried increasing max_parallel_workers_per_gather to 8, but I did
not observe any further scaling.


Dilip Kumar

Attachment    Content-Type              Size
3_head.out    application/octet-stream  3.9 KB
3_patch.out   application/octet-stream  3.3 KB
