From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: ah(at)cybertec(dot)at
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: asynchronous execution
Date: 2017-07-18 07:24:52
Message-ID: 20170718.162452.221576658.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers
Hello,
At Tue, 11 Jul 2017 10:28:51 +0200, Antonin Houska <ah(at)cybertec(dot)at> wrote in <6448(dot)1499761731(at)localhost>
> Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> > Effectively it is a waiting-queue followed by a
> > completed-list. The point of the compaction is keeping the order
> > of waiting or not-yet-completed requests, which is crucial to
> avoid a kind of precedence inversion. We cannot keep the order by
> > using bitmapset in such way.
>
> > The current code waits all waiters at once and processes all
> > fired events at once. The order in the waiting-queue is
> inessential in the case. On the other hand I suppose waiting on
> > several-tens to near-hundred remote hosts is in a realistic
> > target range. Keeping the order could be crucial if we process a
> > part of the queue at once in the case.
> >
> Putting significance on the deviation of response time of
> > remotes, process-all-at-once is effective. In turn we should
> > consider the effectiveness of the lifecycle of the larger wait
> > event set.
>
> ok, I missed the fact that the order of es_pending_async entries is
> important. I think this is worth adding a comment.
I'll put an upper limit on the number of waiters processed at
once, then add such a comment.
> Actually the reason I thought of simplification was that I noticed small
> inefficiency in the way you do the compaction. In particular, I think it's not
> always necessary to swap the tail and head entries. Would something like this
> make sense?
I'm not sure, but I suppose it is rare that none of the first
several elements in the array is COMPLETE. In most cases the
first element gets a response first.
>
> /* If any node completed, compact the array. */
> if (any_node_done)
> {
...
> for (tidx = 0; tidx < estate->es_num_pending_async; ++tidx)
> {
...
> if (tail->state == ASYNCREQ_COMPLETE)
> continue;
>
> /*
> * If the array starts with one or more incomplete requests,
> * both head and tail point at the same item, so there's no
> * point in swapping.
> */
> if (tidx > hidx)
> {
This works to skip the swap for the leading elements as long as
none of them is ASYNCREQ_COMPLETE. I think it makes sense as long
as it doesn't harm the loop. The optimization is more effective
if the skip is hoisted out of the loop like this:
| for (tidx = 0; tidx < estate->es_num_pending_async &&
|      estate->es_pending_async[tidx]->state != ASYNCREQ_COMPLETE; ++tidx)
|     ;
| for (; tidx < estate->es_num_pending_async; ++tidx)
...
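For reference, the hoisted-skip compaction can be compiled standalone
like this. Note that PendingAsyncRequest and the state values here are
stand-ins I made up for illustration, not the types in the patch:

```c
#include <assert.h>

/* Stand-in types; the real executor structures in the patch differ. */
typedef enum { ASYNCREQ_WAITING, ASYNCREQ_COMPLETE } AsyncReqState;

typedef struct PendingAsyncRequest
{
    AsyncReqState state;
    int           id;       /* payload used to check ordering */
} PendingAsyncRequest;

/*
 * Drop completed requests, keeping the not-completed ones in their
 * original order.  The first loop skips the leading run that is
 * already in place; the second compacts the rest.  Returns the new
 * number of pending entries.
 */
static int
compact_pending(PendingAsyncRequest **reqs, int nreqs)
{
    int tidx;
    int hidx;

    for (tidx = 0; tidx < nreqs &&
         reqs[tidx]->state != ASYNCREQ_COMPLETE; ++tidx)
        ;
    hidx = tidx;

    for (; tidx < nreqs; ++tidx)
    {
        if (reqs[tidx]->state == ASYNCREQ_COMPLETE)
            continue;
        reqs[hidx++] = reqs[tidx];
    }
    return hidx;
}
```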
> And besides that, I think it'd be more intuitive if the meaning of "head" and
> "tail" was reversed: if the array is iterated from lower to higher positions,
> then I'd consider head to be at higher position, not tail.
Yeah, but maybe "head" is still confusing even if reversed,
because it is still not the head of anything. It might be less
confusing rewritten in a more verbose-but-straightforward way:
| int npending = 0;
|
| /* Skip over not-completed items at the beginning */
| while (npending < estate->es_num_pending_async &&
|        estate->es_pending_async[npending]->state != ASYNCREQ_COMPLETE)
|     npending++;
|
| /* Scan over the rest for not-completed items */
| for (i = npending + 1; i < estate->es_num_pending_async; ++i)
| {
|     PendingAsyncRequest *tmp;
|     PendingAsyncRequest *curr = estate->es_pending_async[i];
|
|     if (curr->state == ASYNCREQ_COMPLETE)
|         continue;
|
|     /* Swap the not-completed item into the tail of the pending chunk */
|     tmp = estate->es_pending_async[npending];
|     estate->es_pending_async[npending] = curr;
|     estate->es_pending_async[i] = tmp;
|     ++npending;
| }
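As a standalone sketch (again with made-up stand-in types, not the
patch's), the swap variant looks like this. I keep the swap rather
than a plain overwrite on the assumption that the completed entries
past the npending boundary should survive for possible reuse:

```c
#include <assert.h>

/* Stand-in types; the real executor structures in the patch differ. */
typedef enum { ASYNCREQ_WAITING, ASYNCREQ_COMPLETE } AsyncReqState;

typedef struct PendingAsyncRequest
{
    AsyncReqState state;
    int           id;
} PendingAsyncRequest;

/*
 * Compact by swapping: not-completed items move, in order, to the
 * front; completed items end up after the npending boundary instead
 * of being overwritten.  Returns the new pending count.
 */
static int
compact_pending_swap(PendingAsyncRequest **reqs, int nreqs)
{
    int npending = 0;
    int i;

    /* leading not-completed items are already in place */
    while (npending < nreqs &&
           reqs[npending]->state != ASYNCREQ_COMPLETE)
        npending++;

    for (i = npending + 1; i < nreqs; ++i)
    {
        PendingAsyncRequest *curr = reqs[i];
        PendingAsyncRequest *tmp;

        if (curr->state == ASYNCREQ_COMPLETE)
            continue;

        tmp = reqs[npending];
        reqs[npending] = curr;
        reqs[i] = tmp;
        ++npending;
    }
    return npending;
}
```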
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center