Re: Parallel Seq Scan

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Andres Freund <andres(at)2ndquadrant(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-04-01 10:30:40
Message-ID: CAA4eK1+juSxoQqHXtQcwjia5VJ80jTaox4bw16kHJag5L6PnGQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 30, 2015 at 8:35 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Wed, Mar 25, 2015 at 6:27 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > Apart from that, I have moved the initialization of the DSM segment
> > from the InitNode phase to ExecFunnel() (on first execution), as per
> > a suggestion from Robert. The main idea is that since it creates a
> > large shared memory segment, the work should be done only when it is
> > really required.
>
> So, suppose we have a plan like this:
>
> Append
> -> Funnel
> -> Partial Seq Scan
> -> Funnel
> -> Partial Seq Scan
> (repeated many times)
>
> In earlier versions of this patch, that was chewing up lots of DSM
> segments. But it seems to me, on further reflection, that it should
> never use more than one at a time. The first funnel node should
> initialize its workers and then when it finishes, all those workers
> should get shut down cleanly and the DSM destroyed before the next
> scan is initialized.
>
> Obviously we could do better here: if we put the Funnel on top of the
> Append instead of underneath it, we could avoid shutting down and
> restarting workers for every child node. But even without that, I'm
> hoping it's no longer the case that this uses more than one DSM at a
> time. If that's not the case, we should see if we can't fix that.
>

Currently it doesn't behave the way you are expecting: it destroys the
DSM and performs a clean shutdown of the workers
(DestroyParallelContext()) at the time of ExecEndFunnel(), which in
this case happens only when we finish execution of the Append node.
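
To illustrate, the current tear-down placement looks roughly like the
sketch below (a minimal sketch only: the FunnelState field holding the
parallel context is assumed to be named pcxt, which may not match the
actual patch):

void
ExecEndFunnel(FunnelState *node)
{
	/*
	 * The workers and the DSM segment survive until the node itself
	 * is ended, which for a Funnel under an Append happens only after
	 * the whole Append has finished executing.
	 */
	if (node->pcxt != NULL)
	{
		/* waits for the workers to exit and frees the DSM segment */
		DestroyParallelContext(node->pcxt);
		node->pcxt = NULL;
	}

	ExecEndNode(outerPlanState(node));
}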

One way to change this is to do the cleanup of the parallel context
when we fetch the last tuple from the Funnel node (in ExecFunnel), as
at that point we are sure that we don't need the workers or the DSM
anymore. Does that sound reasonable to you?
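
A rough sketch of that idea (again assuming a pcxt field, plus a
hypothetical funnel_getnext() helper that returns the next tuple from
the workers or the local scan):

TupleTableSlot *
ExecFunnel(FunnelState *node)
{
	TupleTableSlot *slot = funnel_getnext(node);

	/*
	 * Once the scan is exhausted, tear down the workers and the DSM
	 * segment immediately instead of waiting for ExecEndFunnel(), so
	 * that sibling Funnel nodes under an Append never have more than
	 * one DSM segment alive at a time.
	 */
	if (TupIsNull(slot) && node->pcxt != NULL)
	{
		DestroyParallelContext(node->pcxt);
		node->pcxt = NULL;
	}

	return slot;
}

A rescan would then have to re-create the parallel context on its next
fetch, in the same way a first execution now does after moving the DSM
initialization into ExecFunnel().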

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
