Re: Parallel Seq Scan

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>, Jeff Davis <pgsql(at)j-davis(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-09-18 19:43:31
Message-ID: CA+TgmoZemL-jo=2vGJN0NeH00fkURGewWxO3D33BTdWEhoOTCQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 18, 2015 at 12:56 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Sep 17, 2015 at 11:44 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> Okay, but I think the same can be achieved with this as well. Basic idea
>> is that each worker will work on one planned statement at a time and in
>> the above case there will be two different planned statements and they will
>> store partial seq scan related information in two different locations in
>> the toc, although the key (PARALLEL_KEY_SCAN) would be the same, and I think
>> this will be quite similar to what we are already doing for response queues.
>> The worker will work on one of those keys based on the planned statement
>> which it chooses to execute. I have explained this in somewhat more detail
>> in one of my previous mails [1].
>
> shm_toc keys are supposed to be unique. If you added more than one
> with the same key, there would be no way to look up the second one.
> That was intentional, and I don't want to revise it.
>
> I don't want to have multiple PlannedStmt objects in any case. That
> doesn't seem like the right approach. I think passing down an Append
> tree with multiple Partial Seq Scan children to be run in order is
> simple and clear, and I don't see why we would do it any other way.
> The master should be able to generate a plan and then copy the part of
> it below the Funnel and send it to the worker. But there's clearly
> never more than one PlannedStmt in the master, so where would the
> other ones come from in the worker? There's no reason to introduce
> that complexity.

Also, as KaiGai pointed out on the other thread, even if you DID pass
two PlannedStmt nodes to the worker, you still need to know which one
goes with which ParallelHeapScanDesc. If both of the
ParallelHeapScanDesc nodes are stored under the same key, then you
can't do that. That's why, as discussed in the other thread, we need
some way of uniquely identifying a plan node.
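
To make the key problem concrete, here is a minimal sketch. It is a toy
model and not PostgreSQL's shm_toc code: ToyToc, toy_toc_insert,
toy_toc_lookup, TOY_KEY_SCAN_BASE and toy_scan_key are all hypothetical
names made up for illustration. It assumes only that lookup returns the
first entry matching a key, so a second entry stored under the same key
can never be reached, and it shows one possible fix, deriving the key
from a per-plan-node identifier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy table of contents. NOT PostgreSQL's shm_toc; illustration only. */
#define TOY_TOC_MAX 16

typedef struct
{
    uint64_t key;
    void    *address;
} ToyTocEntry;

typedef struct
{
    int         nentries;
    ToyTocEntry entries[TOY_TOC_MAX];
} ToyToc;

/* Append an entry. Nothing here rejects a duplicate key. */
static void
toy_toc_insert(ToyToc *toc, uint64_t key, void *address)
{
    assert(toc->nentries < TOY_TOC_MAX);
    toc->entries[toc->nentries].key = key;
    toc->entries[toc->nentries].address = address;
    toc->nentries++;
}

/* Return the address of the FIRST entry with a matching key, or NULL. */
static void *
toy_toc_lookup(const ToyToc *toc, uint64_t key)
{
    for (int i = 0; i < toc->nentries; i++)
    {
        if (toc->entries[i].key == key)
            return toc->entries[i].address;
    }
    return NULL;
}

/* Hypothetical base key for per-scan state; the value is arbitrary. */
#define TOY_KEY_SCAN_BASE UINT64_C(0xE000000000000100)

/*
 * Hypothetical scheme: give each scan its own key by adding an
 * identifier that is unique to the plan node.
 */
static uint64_t
toy_scan_key(int plan_node_id)
{
    return TOY_KEY_SCAN_BASE + (uint64_t) plan_node_id;
}
```

With two scan descriptors inserted under the same key,
toy_toc_lookup() can only ever hand back the first one. With
toy_scan_key(1) and toy_scan_key(2), each plan node finds its own
descriptor, whichever process does the lookup.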

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
