Re: Parallel Seq Scan

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-04-22 12:48:00
Message-ID: CAA4eK1JLv+2y1AwjhsQPFisKhBF7jWF_Nzirmzyno9uPBRCpGw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 30, 2015 at 8:31 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Wed, Mar 18, 2015 at 11:43 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >> I think I figured out the problem. That fix only helps in the case
> >> where the postmaster noticed the new registration previously but
> >> didn't start the worker, and then later notices the termination.
> >> What's much more likely to happen is that the worker is started and
> >> terminated so quickly that both happen before we create a
> >> RegisteredBgWorker for it. The attached patch fixes that case, too.
> >
> > Patch fixes the problem and now for Rescan, we don't need to Wait
> > for workers to finish.
>
> I realized that there is a problem with this. If an error occurs in
> one of the workers just as we're deciding to kill them all, then the
> error won't be reported. Also, the new code to propagate
> XactLastRecEnd won't work right, either. I think we need to find a
> way to shut down the workers cleanly. The idea generally speaking
> should be:
>
> 1. Tell all of the workers that we want them to shut down gracefully
> without finishing the scan.
>
> 2. Wait for them to exit via WaitForParallelWorkersToFinish().
>
> My first idea about how to implement this is to have the master detach
> all of the tuple queues via a new function TupleQueueFunnelShutdown().
> Then, we should change tqueueReceiveSlot() so that it does not throw
> an error when shm_mq_send() returns SHM_MQ_DETACHED. We could modify
> the receiveSlot method of a DestReceiver to return bool rather than
> void; a "true" value can mean "continue processing" whereas a "false"
> value can mean "stop early, just as if we'd reached the end of the
> scan".
>

I have implemented this idea (note that I had to expose a new API,
shm_mq_from_handle, because TupleQueueFunnel stores shm_mq_handle* and
we need shm_mq* to call shm_mq_detach). Apart from this, I have fixed
the other problems reported on this thread:

1. Execute initPlans in the master backend and pass the
required PARAM_EXEC parameter values to the workers.
2. Avoid consuming DSM segments by freeing the parallel context after
the last tuple is fetched.
3. Allow execution of a Result node in a worker backend, as it can
be added as a gating filter on top of PartialSeqScan.
4. Merged the parallel heap scan descriptor patch.
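
To make the graceful-shutdown idea quoted above more concrete, here is a
minimal standalone sketch in C. It is not code from the patch and does not
use the real PostgreSQL headers: ToyQueue, toy_mq_send, toy_receive_slot,
toy_funnel_shutdown and toy_worker_scan are hypothetical stand-ins for the
tuple queue, shm_mq_send(), the receiveSlot method, the proposed
TupleQueueFunnelShutdown() and a worker's scan loop. It only illustrates
the control flow: a detached queue makes the receiver return false, and the
worker then stops as if it had reached the end of the scan instead of
raising an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the result of a queue send. */
typedef enum { TOY_MQ_SUCCESS, TOY_MQ_DETACHED } ToyMqResult;

/* Hypothetical stand-in for one worker-to-master tuple queue. */
typedef struct ToyQueue
{
    bool receiver_detached; /* set once the master no longer wants tuples */
    int  sent;              /* tuples accepted by the queue so far */
} ToyQueue;

/* Master side: detach from the queue to ask the worker to stop. */
static void
toy_funnel_shutdown(ToyQueue *q)
{
    assert(q != NULL);
    q->receiver_detached = true;
}

/* Worker side: try to send one tuple; report detach instead of failing. */
static ToyMqResult
toy_mq_send(ToyQueue *q, int tuple)
{
    assert(q != NULL);
    (void) tuple;           /* the payload is irrelevant to this sketch */
    if (q->receiver_detached)
        return TOY_MQ_DETACHED;
    q->sent++;
    return TOY_MQ_SUCCESS;
}

/*
 * receiveSlot-style callback returning bool rather than void:
 * true means "continue processing", false means "stop early".
 */
static bool
toy_receive_slot(ToyQueue *q, int tuple)
{
    return toy_mq_send(q, tuple) == TOY_MQ_SUCCESS;
}

/*
 * Worker scan loop. Returns the number of tuples produced before the
 * scan ended, either naturally or because the receiver said stop.
 */
static int
toy_worker_scan(ToyQueue *q, int ntuples)
{
    int produced = 0;

    for (int i = 0; i < ntuples; i++)
    {
        produced++;
        if (!toy_receive_slot(q, i))
            break;          /* treated like end of scan, no error raised */
    }
    return produced;
}
```

In this toy model the master would call toy_funnel_shutdown() on every
queue and then wait for the workers to exit, which corresponds to steps 1
and 2 in the quoted proposal; error propagation from the workers is
deliberately left out.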

To apply the patch, please follow the sequence below:

HEAD Commit-Id: 4d930eee
parallel-mode-v9.patch [1]
assess-parallel-safety-v4.patch [2] (don't forget to run fixpgproc.pl in
the patch)
parallel_seqscan_v14.patch (Attached with this mail)

[1] -
http://www.postgresql.org/message-id/CA+TgmoZfSXZhS6qy4Z0786D7iU_AbhBVPQFwLthpSvGieczqHg@mail.gmail.com
[2] -
http://www.postgresql.org/message-id/CA+TgmobJSuefiPOk6+i9WERUgeAB3ggJv7JxLX+r6S5SYydBRQ@mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
parallel_seqscan_v14.patch application/octet-stream 148.3 KB
