Re: Proposal : Parallel Merge Join

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal : Parallel Merge Join
Date: 2016-12-11 06:07:43
Message-ID: CAFiTN-tuNKejG8Dqjz7KEuoHm9amJUTQryv+ysZLwPgRBt5g6Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Dec 11, 2016 at 8:14 AM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> I noticed that the partially parallelized Merge Join EXPLAIN ANALYZE
> that you attach has a big misestimation:
>
> -> Merge Join (cost=3405052.45..3676948.66 rows=320 width=32)
> (actual time=21165.849..37814.551 rows=1357812 loops=4)
>
> Is this the best test case to show off the patch? This node is the
> immediate outer child of a Nested Loop Semi Join, and so I'm concerned
> that we measuring the wrong thing.

Actually main purpose of showing this example is that, by enabling
parallelism with merge join we can enable the parallelism in complete
tree, and we can get completely different kind of plan which is way
cheaper.

But I completely agree with your point that, in this particular
example estimation is wrong and may be with correct estimation plan
can be entirely different. I am planning to test this with more self
generated test case where first we can guarantee that our estimation
is almost correct and see the impact of the patch.

Here is one such example, this time I tested my patch with Amit's
parallel index scan patch [1], just to enable more scope for parallel
merge join.

All configurations and machines are same what I mentioned up thread.

create table test1(a, b) as select generate_series(1,
10000000),generate_series(1, 10000000);
create table test2(a, b) as select generate_series(1,
10000000),generate_series(1, 10000000);
create index idx1 on test1(a);
create index idx2 on test2(a);

Query:
explain analyze select * from test1, test2 where test1.a=test2.a and
test1.b = test2.b and test1.a+test2.b < 3 and test1.a < 3000000 and
test2.a < 3000000;

On Head:

Merge Join (cost=6.92..228073.90 rows=1 width=16) (actual
time=0.033..3123.400 rows=1 loops=1)
Merge Cond: (test1.a = test2.a)
Join Filter: ((test1.b = test2.b) AND ((test1.a + test2.b) < 3))
Rows Removed by Join Filter: 2999998
-> Index Scan using idx1 on test1 (cost=0.43..98626.90
rows=2998198 width=8) (actual time=0.013..746.382 rows=2999999
loops=1)
Index Cond: (a < 3000000)
-> Index Scan using idx2 on test2 (cost=0.43..98708.62
rows=3000639 width=8) (actual time=0.012..751.558 rows=2999999
loops=1)
Index Cond: (a < 3000000)
Planning time: 0.604 ms
Execution time: 3123.442 ms

With Patch:

Gather (cost=1005.46..193001.64 rows=1 width=16) (actual
time=0.244..1883.087 rows=1 loops=1)
Workers Planned: 3
Workers Launched: 3
-> Merge Join (cost=5.46..192000.64 rows=1 width=16) (actual
time=1409.686..1880.394 rows=0 loops=4)
Merge Cond: (test2.a = test1.a)
Join Filter: ((test1.b = test2.b) AND ((test1.a + test2.b) < 3))
Rows Removed by Join Filter: 750000
-> Parallel Index Scan using idx2 on test2
(cost=0.43..78381.71 rows=967948 width=8) (actual time=0.023..221.172
rows=750000 loops=4)
Index Cond: (a < 3000000)
-> Index Scan using idx1 on test1 (cost=0.43..98626.90
rows=2998198 width=8) (actual time=0.024..893.081 rows=2999254
loops=4)
Index Cond: (a < 3000000)
Planning time: 0.281 ms
Execution time: 1888.750 ms

This example shows direct improvement with merge join, but IMHO the
main value of this patch is that we can parallelize multi level join
query, where parallelism is not used becuase of merge join at some
level.

[1] https://www.postgresql.org/message-id/CAA4eK1KA4LwTYXZG%3Dk3J2bA%2BZOEYnz2gkyWBKgY%3D_q0xJRBMDw%40mail.gmail.com

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2016-12-11 06:12:32 Re: proposal: psql statements \gstore \gstore_binary (instead COPY RAW)
Previous Message Karl O. Pinc 2016-12-11 03:38:49 Re: Patch to implement pg_current_logfile() function