Re: Parallel Hash take II

From: Andres Freund <andres(at)anarazel(dot)de>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Prabhat Sahu <prabhat(dot)sahu(at)enterprisedb(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Oleg Golovanov <rentech(at)mail(dot)ru>
Subject: Re: Parallel Hash take II
Date: 2017-11-08 19:12:40
Message-ID: 20171108191240.72q5gtrjqa7jge4w@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

@@ -747,7 +747,7 @@ try_hashjoin_path(PlannerInfo *root,
* never have any output pathkeys, per comments in create_hashjoin_path.
*/
initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
- outer_path, inner_path, extra);
+ outer_path, inner_path, extra, false);

if (add_path_precheck(joinrel,
workspace.startup_cost, workspace.total_cost,
@@ -761,6 +761,7 @@ try_hashjoin_path(PlannerInfo *root,
extra,
outer_path,
inner_path,
+ false, /* parallel_hash */
extra->restrictlist,
required_outer,
hashclauses));
@@ -776,6 +777,10 @@ try_hashjoin_path(PlannerInfo *root,
* try_partial_hashjoin_path
* Consider a partial hashjoin join path; if it appears useful, push it into
* the joinrel's partial_pathlist via add_partial_path().
+ * The outer side is partial. If parallel_hash is true, then the inner path
+ * must be partial and will be run in parallel to create one or more shared
+ * hash tables; otherwise the inner path must be complete and a copy of it
+ * is run in every process to create separate identical private hash tables.
*/

When do we have "or more shared hash tables" rather than one? Are you
thinking about subordinate nodes?

@@ -1839,6 +1846,10 @@ hash_inner_and_outer(PlannerInfo *root,
* able to properly guarantee uniqueness. Similarly, we can't handle
* JOIN_FULL and JOIN_RIGHT, because they can produce false null
* extended rows. Also, the resulting path must not be parameterized.
+ * We should be able to support JOIN_FULL and JOIN_RIGHT for Parallel
+ * Hash, since in that case we're back to a single hash table with a
+ * single set of match bits for each batch, but that will require
+ * figuring out a deadlock-free way to wait for the probe to finish.
*/

s/should be able/would be able/?

index 6a45b68e5df..2d38a5efae8 100644
--- a/src/backend/storage/ipc/barrier.c
+++ b/src/backend/storage/ipc/barrier.c
@@ -451,7 +451,6 @@ BarrierDetachImpl(Barrier *barrier, bool arrive)
release = true;
barrier->arrived = 0;
++barrier->phase;
- Assert(barrier->selected);
barrier->selected = false;
}

Uh, what?

diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 35523bd8065..40a076d976f 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -5821,6 +5821,9 @@ analyze extremely_skewed;
insert into extremely_skewed
select 42 as id, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
from generate_series(1, 19000);
+-- Make a relation with a couple of enormous tuples.
+create table wide as select generate_series(1, 2) as id, rpad('', 320000, 'x') as t;
+alter table wide set (parallel_workers = 2);

I'm doubtful this is actually going to be a wide tuple - this'll get
compressed down quite a bit, no?

postgres[26465][1]=# SELECT octet_length(t), pg_column_size(t) FROM wide ;
┌──────────────┬────────────────┐
│ octet_length │ pg_column_size │
├──────────────┼────────────────┤
│ 320000 │ 3671 │
│ 320000 │ 3671 │
└──────────────┴────────────────┘
(2 rows)

(and yes, it's ridiculous that a compressed datum of that size still
takes up 3kb)

+-- parallel with parallel-aware hash join
+set max_parallel_workers_per_gather = 2;
+set work_mem = '128kB';
+set enable_parallel_hash = on;

I think it'd be better if we structured the file so we just sat guc's
with SET LOCAL inside a transaction.

+-- parallel with parallel-aware hash join
+set max_parallel_workers_per_gather = 2;
+set work_mem = '64kB';
+set enable_parallel_hash = on;
+explain (costs off)
+ select count(*) from simple r join extremely_skewed s using (id);
+ QUERY PLAN
+-----------------------------------------------------------------------
+ Finalize Aggregate
+ -> Gather
+ Workers Planned: 2
+ -> Partial Aggregate
+ -> Parallel Hash Join
+ Hash Cond: (r.id = s.id)
+ -> Parallel Seq Scan on simple r
+ -> Parallel Hash
+ -> Parallel Seq Scan on extremely_skewed s
+(9 rows)
+
+select count(*) from simple r join extremely_skewed s using (id);
+ count
+-------
+ 20000
+(1 row)

As written before, I think it'd be good if we extracted the number of
batches from json output to be sure things are going sensibly.

All-in-all this part looks fairly boring.

- Andres

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Юрий Соколов 2017-11-08 20:44:50 Re: Small improvement to compactify_tuples
Previous Message Oleg Bartunov 2017-11-08 18:46:49 Re: need info about extensibility in other databases