Re: parallel joins, and better parallel explain

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallel joins, and better parallel explain
Date: 2015-12-22 09:14:11
Message-ID: CAFiTN-tPszTZ2wo5y2aKKwBv_U3FJAhr0Ep7-yyG0qoAQuh2Pg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Dec 18, 2015 at 8:47 PM, Robert Haas wrote:

>> Yes, you are right that create_gather_path() sets parallel_safe to false
>> unconditionally, but whenever we are building a non-partial path we
>> should carry forward the parallel_safe state to its parent, and it seems
>> like that part is missing here.

> Ah, right. Woops. I can't exactly replicate your results, but I've
> attempted to fix this in a systematic way in the new version attached
> here (parallel-join-v3.patch).
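
To illustrate what I mean, the propagation I had in mind is roughly the
following, taking the hash join path constructor as an example. This is
only a sketch of the idea, not the actual patch text, and the field names
are from my reading of the current tree:

    /*
     * Sketch (not the actual patch): a non-partial join path should
     * inherit parallel-safety from its relation and both child paths.
     * Since create_gather_path() sets parallel_safe = false, any join
     * built on top of a Gather is then correctly marked unsafe and can
     * never be used to form another partial path.
     */
    pathnode->jpath.path.parallel_safe = joinrel->consider_parallel &&
                                         outer_path->parallel_safe &&
                                         inner_path->parallel_safe;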

I have tested with the latest patch, and the problem is solved.

During my testing I observed one more behaviour in the hash join: the
parallel hash join is taking more time than the normal hash join.

Here I have ensured that, apart from the hash column, there is one more
condition on another column, which forces random page fetches.

I think this behaviour is similar to what Amit reported in this thread:
http://www.postgresql.org/message-id/CAA4eK1+s3uD2G1WSkEAW_FZgp8jeYw3YcnvtuepLihE_e1D7Dw@mail.gmail.com

create table t1 (c1 int, c2 int, c3 text);

create table t2 (c1 int, c2 int, c3 text);

insert into t1 values(generate_series(1,10000000),
generate_series(1,10000000), repeat('x', 1000));

insert into t2 values(generate_series(1,3000000),
generate_series(1,3000000), repeat('x', 5));
analyze t1;
analyze t2;
set max_parallel_degree=6;
postgres=# explain analyze SELECT count(*) FROM t1 JOIN t2 ON t1.c1 = t2.c1 AND t2.c2 + t1.c1 > 100;
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=474378.39..474378.40 rows=1 width=0) (actual time=34507.573..34507.573 rows=1 loops=1)
   ->  Gather  (cost=96436.00..471878.39 rows=1000000 width=0) (actual time=2004.186..33918.216 rows=2999950 loops=1)
         Number of Workers: 6
         ->  Hash Join  (cost=95436.00..370878.39 rows=1000000 width=0) (actual time=2077.085..18651.868 rows=428564 loops=7)
               Hash Cond: (t1.c1 = t2.c1)
               Join Filter: ((t2.c2 + t1.c1) > 100)
               Rows Removed by Join Filter: 7
               ->  Parallel Seq Scan on t1  (cost=0.00..235164.93 rows=1538462 width=4) (actual time=0.741..13199.231 rows=1428571 loops=7)
               ->  Hash  (cost=46217.00..46217.00 rows=3000000 width=8) (actual time=2070.827..2070.827 rows=3000000 loops=7)
                     Buckets: 131072  Batches: 64  Memory Usage: 2861kB
                     ->  Seq Scan on t2  (cost=0.00..46217.00 rows=3000000 width=8) (actual time=0.027..904.607 rows=3000000 loops=7)
 Planning time: 0.292 ms
 Execution time: 34507.857 ms
(13 rows)

postgres=# set max_parallel_degree=0;
SET
postgres=# explain analyze SELECT count(*) FROM t1 JOIN t2 ON t1.c1 = t2.c1 AND t2.c2 + t1.c1 > 100;
                                                            QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1823853.06..1823853.07 rows=1 width=0) (actual time=17833.067..17833.067 rows=1 loops=1)
   ->  Hash Join  (cost=95436.00..1821353.06 rows=1000000 width=0) (actual time=1286.788..17558.987 rows=2999950 loops=1)
         Hash Cond: (t1.c1 = t2.c1)
         Join Filter: ((t2.c2 + t1.c1) > 100)
         Rows Removed by Join Filter: 50
         ->  Seq Scan on t1  (cost=0.00..1528572.04 rows=10000004 width=4) (actual time=2.728..9881.659 rows=10000000 loops=1)
         ->  Hash  (cost=46217.00..46217.00 rows=3000000 width=8) (actual time=1279.688..1279.688 rows=3000000 loops=1)
               Buckets: 131072  Batches: 64  Memory Usage: 2861kB
               ->  Seq Scan on t2  (cost=0.00..46217.00 rows=3000000 width=8) (actual time=0.029..588.887 rows=3000000 loops=1)
 Planning time: 0.314 ms
 Execution time: 17833.143 ms
(11 rows)

Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

On Fri, Dec 18, 2015 at 8:47 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Fri, Dec 18, 2015 at 3:54 AM, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > On Fri, Dec 18, 2015 at 7:59 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >> Uh oh. That's not supposed to happen. A GatherPath is supposed to
> >> have parallel_safe = false, which should prevent the planner from
> >> using it to form new partial paths. Is this with the latest version
> >> of the patch? The plan output suggests that we're somehow reaching
> >> try_partial_hashjoin_path() with inner_path being a GatherPath, but I
> >> don't immediately see how that's possible, because
> >> create_gather_path() sets parallel_safe to false unconditionally, and
> >> hash_inner_and_outer() never sets cheapest_safe_inner to a path unless
> >> that path is parallel_safe.
> >
> > Yes, you are right that create_gather_path() sets parallel_safe to false
> > unconditionally, but whenever we are building a non-partial path we
> > should carry forward the parallel_safe state to its parent, and it seems
> > like that part is missing here.
>
> Ah, right. Woops. I can't exactly replicate your results, but I've
> attempted to fix this in a systematic way in the new version attached
> here (parallel-join-v3.patch).
>
> >> Do you have a self-contained test case that reproduces this, or any
> >> insight as to how it's happening here?
> >
> > This is the TPC-H benchmark case; we can set it up like this:
> > 1. git clone https://tkejser(at)bitbucket(dot)org/tkejser/tpch-dbgen.git
> > 2. compile using make
> > 3. ./dbgen -v -s 5
> > 4. ./qgen
>
> Thanks. After a bit of fiddling I was able to get this to work. I'm
> attaching two other patches that seem to help this case quite
> considerably. The first (parallel-reader-order-v1) causes Gather to
> read from the same worker repeatedly until it can't get another tuple
> from that worker without blocking, and only then move on to the next
> worker. With 4 workers, this seems to be drastically more efficient
> than what's currently in master - I saw the time for Q5 drop from over
> 17 seconds to about 6 (this was an assert-enabled build running with
> EXPLAIN ANALYZE, though, so take those numbers with a grain of salt).
> The second (gather-disuse-physical-tlist.patch) causes Gather to force
> underlying scan nodes to project, which is a good idea here for
> reasons very similar to why it's a good idea for the existing node
> types that use disuse_physical_tlist: forcing extra data through the
> Gather node is bad. That shaved another half second off this query.
>
> The exact query I was using for testing was:
>
> explain (analyze, verbose) select n_name, sum(l_extendedprice * (1 -
> l_discount)) as revenue from customer, orders, lineitem, supplier,
> nation, region where c_custkey = o_custkey and l_orderkey = o_orderkey
> and l_suppkey = s_suppkey and c_nationkey = s_nationkey and
> s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name =
> 'EUROPE' and o_orderdate >= date '1995-01-01' and o_orderdate < date
> '1995-01-01' + interval '1' year group by n_name order by revenue
> desc;
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
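
For anyone following along, my reading of the parallel-reader-order-v1
idea is roughly the loop below. This is only a sketch of the behaviour
Robert describes (drain the current worker's queue until it would block,
then move on), not the actual patch text, and the helper name
gather_readnext_sketch is made up for illustration:

    static HeapTuple
    gather_readnext_sketch(GatherState *gs)
    {
        int nwaiting = 0;

        while (gs->nreaders > 0)
        {
            bool        readerdone;
            HeapTuple   tup;

            /* Non-blocking read from the current worker's tuple queue. */
            tup = TupleQueueReaderNext(gs->reader[gs->nextreader],
                                       true /* nowait */, &readerdone);

            if (readerdone)
            {
                /* This worker is finished; drop its reader and continue. */
                gs->reader[gs->nextreader] = gs->reader[--gs->nreaders];
                if (gs->nreaders > 0)
                    gs->nextreader %= gs->nreaders;
                nwaiting = 0;
                continue;
            }

            if (tup != NULL)
                return tup;     /* stay on this worker while it has tuples */

            /* Queue empty but worker still running: try the next worker. */
            gs->nextreader = (gs->nextreader + 1) % gs->nreaders;
            if (++nwaiting >= gs->nreaders)
            {
                /* Every queue was empty; sleep until a worker signals us. */
                WaitLatch(MyLatch, WL_LATCH_SET, 0);
                ResetLatch(MyLatch);
                nwaiting = 0;
            }
        }
        return NULL;            /* all workers exhausted */
    }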
