Re: [DESIGN] ParallelAppend

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "Kyotaro HORIGUCHI" <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: [DESIGN] ParallelAppend
Date: 2015-08-25 00:49:12
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8011375CE@BPXM15GP.gisp.nec.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Fri, Aug 21, 2015 at 7:40 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >
> > On Tue, Aug 18, 2015 at 11:27 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >> Here is one other issue I found. Existing code assumes a TOC segment has
> > >> only one contents per node type, so it uses pre-defined key (like
> > >> PARALLEL_KEY_SCAN) per node type, however, it is problematic if we put
> > >> multiple PlannedStmt or PartialSeqScan node on a TOC segment.
> > >
> > > We have few keys in parallel-seq-scan patch
> > > (PARALLEL_KEY_TUPLE_QUEUE and PARALLEL_KEY_INST_INFO) for
> > > which multiple structures are shared between master and worker backends.
> > >
> > > Check if something similar can work for your use case.
> >
> > I think you are possibly missing the point.
>
> It could be possible, but let me summarize what I thought would be required
> for above use case. For Parallel Append, we need to push multiple
> planned statements in contrast to one planned statement as is done for
> current patch and then one or more parallel workers needs to work on each
> planned statement. So if we know in advance how many planned statements
> are we passing down (which we should), then using ParallelWorkerNumber
> (ParallelWorkerNumber % num_planned_statements or some other similar
> way), workers can find the the planned statement on which they need to work
> and similarly information for PartialSeqScan (which currently is parallel heap
> scan descriptor information).
>
My problem is that we have no identifier to point a particular element on
the TOC segment even if PARALLEL_KEY_PLANNEDSTMT or PARALLEL_KEY_SCAN can
have multiple items.
Please assume a situation when ExecPartialSeqScan() has to lookup
a particular item on TOC but multiple PartialSeqScan nodes can exist.

Currently, it does:
pscan = shm_toc_lookup(node->ss.ps.toc, PARALLEL_KEY_SCAN);

However, ExecPartialSeqScan() cannot know which is the index of mine,
or it is not reasonable to pay attention on other node in this level.
Even if PARALLEL_KEY_SCAN has multiple items, PartialSeqScan node also
needs to have identifier.

> > I think KaiGai's correct,
> > and I pointed out the same problem to you before. The parallel key
> > for the Partial Seq Scan needs to be allocated on the fly and carried
> > in the node, or we'll never be able to push multiple things below the
> > funnel.
>
> Okay, immediately I don't see what is the best way to achieve this but
> let us discuss this separately on Parallel Seq Scan thread and let me
> know if you have something specific in your mind. I will also give this
> a more thought.
>
I want to have 'node_id' in the Plan node, then unique identifier is
assigned on the field prior to serialization. It is a property of the
Plan node, so we can reproduce this identifier on the background worker
side using stringToNode(), then ExecPartialSeqScan can pull out a proper
field from the TOC segment by this node_id.
Probably, we can co-exist this structure without big changes.

1. Define PARALLEL_KEY_DYNAMIC_LEAST as a least value that is larger
than any static TOC key (like PARALLEL_KEY_TUPLE_QUEUE).
2. Run plan-tree node walker on InitializeParallelWorkers, just before
nodeToString(), to assign node_id larger than the above label and
with increasing for each node.
3. Use node_id instead of the static PARALLEL_KEY_SCAN on
ExecPartialSeqScan

Even though we need some more trivial fixes are needed, it seems to
me the above approach shall work.
Also, please note that I don't assume only PartialSeqScan want to
have its field on TOC segment, but some CustomScan node also wants
to have its own shared field when co-working under Funnel node.

On the other hand, I think it is too aggressive to complete the
initial work of this patch by the starting day of the next commit
fest, so I think the target is middle of October.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kouhei Kaigai 2015-08-25 01:18:30 Re: Foreign join pushdown vs EvalPlanQual
Previous Message Tom Lane 2015-08-24 22:27:15 Re: PostgreSQL for VAX on NetBSD/OpenBSD