Re: asynchronous and vectorized execution

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: amitdkhan(dot)pg(at)gmail(dot)com
Cc: robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: asynchronous and vectorized execution
Date: 2016-08-02 07:41:05
Message-ID: 20160802.164105.22186637.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you for the comment.

At Mon, 1 Aug 2016 10:44:56 +0530, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote in <CAJ3gD9ek4Y4SGTSuc_pzkGYwLMbrc9QOM7m1D8bj99JNW16o0g(at)mail(dot)gmail(dot)com>
> On 21 July 2016 at 15:20, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp
> > wrote:
>
> >
> > After some consideration, I found that ExecAsyncWaitForNode
> > cannot be reentrant because it means that the control goes into
> > async-unaware nodes while having not-ready nodes, that is
> > inconsistent state. To inhibit such reentering, I allocated node
> > identifiers in depth-first order so that ascendant-descendant
> > relationship can be checked (nested-set model) in simple way and
> > call ExecAsyncConfigureWait only for the descendant nodes of the
> > parameter planstate.
> >
> >
> We have estate->waiting_nodes containing a mix of async-aware and
> non-async-aware nodes. I was thinking, an asynchrony tree would have only
> async-aware nodes, with possible multiple asynchrony sub-trees in a tree.
> Somehow, if we restrict the bubbling up of events only upto the root of the
> asynchrony subtree, do you think we can simplify some of the complexities ?

The current code prohibiting regsitration of nodes outside the
current subtree to avoid the reentring-disaster.

Indeed leaving the "waiting node" mark or something like on every
root node at the first visit will enable the propagation to stop
upto the root of any async-subtree. Neverheless, when an
async-child in an inactive async-root fires, the new tuple is
loaded but is not consumed then the succeeding firing on the same
child leads to a dead-lock (without result queueing). However,
that can be avoided if ExecAsyncConfigureWait doesn't register
nodes in ready state.

On the other hand, any two or more asynchronous nodes can share a
syncronization object. For instance, multiple postgres_fdw scan
node can share one server connection and only one of them can get
into waitable state at once. If no async-child in the current
async subtree is waitable, it must be stuck. So I think it is
crucial for ExecAsyncWaitForNode to force at least one child *in
the current async subtree* to get into waiting state for such
situation. The ascendant-descendant relationship is necessary to
do that anyway.

Since we should have the node-id to detect ascendant-descendant
relationship anyway and finally should restrict async-nodes with
it, activating only descendant node from the first would make the
things rather simple than avoiding possible dead-lock laster as
described above.

# It is implemented as per-subtree waiting-node list but it was
# fragile and too ugly..

> For e.g. ExecAsyncWaitForNode() has become a bit complex seemingly because
> it has to handle non-async-nodes also, and that's the reason I believe you
> have introduced modes such as ASYNCCONF_FORCE_ADD.

As explained above, the ASYNCCONF_FORCE_ADD is not for
non-async-nodes, but for sets of async nodes that share a
synchronization object. We could let ExecAsyncConfigureWait force
acquire async-object from the first, but it in turn causes
possiblly unnecessary transfer of a sync-object among the nodes
sharing it.

I wish the above sentsnces are readable enough, but any questions
are welcome even the meaning of a sentence.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tal Walter 2016-08-02 08:13:08 Wanting to learn about pgsql design decision
Previous Message Alfred Perlstein 2016-08-02 07:21:38 Re: Why we lost Uber as a user