Re: [DESIGN] ParallelAppend

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: [DESIGN] ParallelAppend
Date: 2015-07-28 04:24:06
Message-ID: CAA4eK1+nRK+ANwETsUw2_N17g_+AuiAqJX8oMQ5jhE+7AAiwiw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jul 28, 2015 at 7:59 AM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
>
> > -----Original Message-----
> > From: pgsql-hackers-owner(at)postgresql(dot)org
> > [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Kouhei Kaigai
> > Sent: Monday, July 27, 2015 11:07 PM
> > To: Amit Kapila
> > >
> > > Is there a real need to have a new node like ParallelAppendPath?
> > > Can't we have a Funnel node beneath the Append node, with each
> > > worker responsible for a SeqScan on each inherited child relation?
> > > Something like
> > >
> > > Append
> > >  --> Funnel
> > >       --> SeqScan rel1
> > >       --> SeqScan rel2
> > >
> > If Funnel can handle both horizontal and vertical parallelism,
> > it is a great simplification. I am not attached to a new node.
> >
> > Once Funnel gets the capability to have multiple child nodes, the
> > Append node above will probably be gone. I expect
> > set_append_rel_pathlist() to add two paths, based on Append and
> > Funnel, and the planner to choose the cheaper one according to its
> > cost.
> >
> In the latest v16 patch, Funnel is declared as follows:
>
> typedef struct Funnel
> {
>     Scan    scan;
>     int     num_workers;
> } Funnel;
>
> If we try to add Append capability here, I expect the structure
> would be adjusted as follows, for example:
>
> typedef struct Funnel
> {
>     Scan    scan;
>     List   *funnel_plans;
>     List   *funnel_num_workers;
> } Funnel;
>
> As the names suggest, funnel_plans stores the underlying Plan nodes
> instead of the lefttree, and funnel_num_workers stores the number of
> workers expected to be assigned to each child plan.
>

Or shall we have a node like the above and name it FunnelAppend or
AppendFunnel?
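For concreteness, such a node might look like the following (just a
rough sketch reusing the fields you proposed above; the name and
layout are illustrative, not from any posted patch):

#include "postgres.h"
#include "nodes/plannodes.h"    /* Scan, Plan */
#include "nodes/pg_list.h"      /* List */

/* Illustrative sketch only, not from the v16 patch. */
typedef struct FunnelAppend
{
    Scan        scan;                 /* common scan fields */
    List       *funnel_plans;         /* child Plan nodes, not lefttree */
    List       *funnel_num_workers;   /* expected workers per child plan */
} FunnelAppend;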

> Even though create_parallelscan_paths() in v16 sets num_workers no
> larger than parallel_seqscan_degree, the total number of concurrent
> background workers may exceed this setting if two or more
> PartialSeqScan nodes are underneath.
> It is a separate setting from max_worker_processes, so it does not
> matter as long as we have another restriction.
> However, how do we cap the number of worker processes per
> "appendable" Funnel node? For example, suppose a parent table has 200
> child tables but max_worker_processes is configured to 50.
> It is obviously impossible to launch all the background workers
> simultaneously. One idea I have is to suspend the launch of some
> plans until earlier ones are completed.
>

Okay, but I think that idea requires re-launching workers for each new
set of relation scans, which could turn out to be costly. How about
designing some way for workers, after completing their assigned work,
to check for a new task (which in this case would be to scan a new
relation) and then execute it? I think in this way we can achieve
dynamic allocation of work and maximum parallelism with the available
set of workers. We have achieved this in ParallelSeqScan by scanning
at the block level: once a worker finishes a block, it checks for a
new block to scan.
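Something along those lines might look like this (a rough sketch only;
all of these names are made up for illustration and are not from the
v16 patch):

#include "postgres.h"
#include "storage/spin.h"

/* Hypothetical shared state for handing out child plans to workers. */
typedef struct ParallelAppendState
{
    slock_t     pa_mutex;       /* protects pa_next_plan */
    int         pa_next_plan;   /* index of next unclaimed child plan */
    int         pa_num_plans;   /* total number of child plans */
} ParallelAppendState;

/*
 * A worker calls this after finishing its current child scan; it
 * returns the index of the next unclaimed child plan, or -1 when all
 * plans have been claimed and the worker can exit.
 */
static int
pa_claim_next_plan(ParallelAppendState *pas)
{
    int         next = -1;

    SpinLockAcquire(&pas->pa_mutex);
    if (pas->pa_next_plan < pas->pa_num_plans)
        next = pas->pa_next_plan++;
    SpinLockRelease(&pas->pa_mutex);

    return next;
}

This is the same shared-counter idea as the block-level allocation in
ParallelSeqScan, just at plan granularity instead of block granularity.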

>
> > We will need to pay attention to other issues when Funnel kicks off
> > background workers towards asymmetric relations.
> >
> > If the number of rows varies across individual child nodes, we may
> > want to assign 10 background workers to scan rel1 with PartialSeqScan.
> > On the other hand, rel2 may have a very small number of rows, so its
> > total_cost may be smaller than the cost of launching a worker.
> > In this case, Funnel has child nodes to be executed asynchronously
> > and synchronously.
> >

I think this might turn out to be slightly tricky; for example, how do
we know how many workers are sufficient for a relation of a given size?
Another way to look at dividing the work in this case could be in
terms of chunks of blocks: once a worker finishes its current set of
blocks, it should be able to get a new set of blocks to scan. So let
us assume we decide on a chunk size of 32 and the total number of
blocks in the whole inheritance hierarchy is 3200; then the maximum
number of workers we should allocate to this scan is 100, and if
parallel_seqscan_degree is less than that, we can use that many
workers and let them scan 32 blocks at a time.
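As a quick sketch of that computation (illustrative names only; the
chunk size and the caller-supplied degree are assumptions for the
example, not part of the patch):

#include "postgres.h"
#include "storage/block.h"      /* BlockNumber */

/*
 * Cap the worker count by the number of chunk_size-block chunks in
 * the whole inheritance hierarchy, then by parallel_seqscan_degree
 * (passed in here as "degree").
 */
static int
compute_worker_cap(BlockNumber total_blocks, int chunk_size, int degree)
{
    int         max_workers = (total_blocks + chunk_size - 1) / chunk_size;

    /* e.g. 3200 blocks with 32-block chunks => at most 100 workers */
    return Min(max_workers, degree);
}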

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
