Re: [DESIGN] ParallelAppend

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: [DESIGN] ParallelAppend
Date: 2015-11-12 12:45:48
Message-ID: 9A28C8860F777E439AA12E8AEA7694F80116F22A@BPXM15GP.gisp.nec.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 2015/11/12 14:09, Kouhei Kaigai wrote:
> > I'm now designing the parallel feature of Append...
> >
> > Here is one challenge. How do we determine whether each sub-plan
> > allows execution in the background worker context?
> >
> > The commit f0661c4e8c44c0ec7acd4ea7c82e85b265447398 added
> > 'parallel_aware' flag on Path and Plan structure.
> > It tells us whether the plan/path-node can perform by multiple
> > background worker processes concurrently, but also tells us
> > nothing whether the plan/path-node are safe to run on background
> > worker process context.
>
> When I was looking at the recent parallelism related commits, I noticed a
> RelOptInfo.consider_parallel flag. That and the function
> set_rel_consider_parallel() may be of interest in this regard.
> set_append_rel_size() passes the parent's state of this flag down to child
> relations but I guess that's not what you're after.
>
Thanks for this information. Indeed, it shall inform us which base
relations are valid for parallel execution.
In case of parallel-append, we can give up parallelism if any of
underlying base relation is not parallel aware. We can use same
logic for join relation cases, potentially.

A challenge is how to count up optimal number of background worker
process. I assume the number of workers of parallel-append shall be
sum of required number of workers by the sub-plans unless it does not
exceed max_parallel_degree.
Probably, we need pathnode_tree_walker() to count up number of workers
required by the sub-plans.

BTW, is the idea of consider_parallel flag in RelOptInfo workable for
join relations also? In case when A LEFT JOIN B for example, it can be
parallel aware if join path has A as outer input, but it cannot be if
B would be outer input.
I doubt that this kind of information belong to Path, not RelOptInfo.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2015-11-12 12:50:40 Re: psql: add \pset true/false
Previous Message Robert Haas 2015-11-12 12:45:00 Re: CustomScan support on readfuncs.c