Re: asynchronous execution

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp
Cc: robertmhaas(at)gmail(dot)com, amitdkhan(dot)pg(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: asynchronous execution
Date: 2017-02-23 11:59:25
Message-ID: 20170223.205925.120810176.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello, I totally reorganized the patch set to four pathces on the
current master (9e43e87).

At Wed, 22 Feb 2017 17:39:45 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote in <20170222(dot)173945(dot)262776579(dot)horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
> Finally, I couldn't see the crash for the (maybe) same case. I
> can guess two reasons for this. One is that a situation where
> node->as_nasyncpending differs from estate->es_num_pending_async,
> but I couldn't find a possibility. Another is a situation in
> postgresIterateForeignScan where the "next owner" reaches eof but
> another waiter is not. I haven't reproduce the situation but
> fixed it for the case. Addition to that I found a bug in
> ExecAsyncAppendResponse. It calls bms_add_member inappropriate
> way.

This found to be wrong. The true problem here was (maybe) that
ExecAsyncRequest can complete a tuple immediately. This causes
multiple calling to ExecAsyncRequest for the same child at
once. (For the case, the processing node is added again to
node->as_needrequest before ExecAsyncRequest returns.)

Using a copy of node->as_needrequest will fix this but it is
uneasy so I changed ExecAsyncRequest not to return a tuple
immediately. Instaed, ExecAsyncEventLoop skips waiting if no node
to wait. The tuple previously "response"'ed in ExecAsyncRequest
is now responsed here.

Addition to that, the current policy of preserving of
es_wait_event_set doesn't seem to work with the async-capable
postgres_fdw. So the current code cleares it at every entering to
ExecAppend. This needs more thoughts.

I measured the performance of async-execution and it was quite
good from the previous version especially for single-connection
environment.

pf0: 4 foreign tables on single connection
non async : (prev) 7928ms -> (this time)7993ms
async : (prev) 6447ms -> (this time)3211ms

pf1: 4 foreign tables on dedicate connection for every table
non async : (prev) 8023ms -> (this time)7953ms
async : (prev) 1876ms -> (this time)1841ms

Boost rate by async execution is 60% for single connectsion and
77% for dedicate connection environment.

> > Mmm, I reproduces it quite easily. A silly bug.
> >
> > Something bad is happening between freeing ExecutorState memory
> > context and resource owner. Perhaps the ExecutorState is freed by
> > resowner (as a part of its anscestors) before the memory for the
> > WaitEventSet is freed. It was careless of me. I'll reconsider it.
>
> The cause was that the WaitEventSet was placed in ExecutorState
> but registered to TopTransactionResourceOwner. I fixed it.

The attached patches are the following.

0001-Allow-wait-event-set-to-be-registered-to-resource-ow.patch
Allows WaitEventSet to released by resource owner.

0002-Asynchronous-execution-framework.patch
Asynchronous execution framework based on Robert's version. All
edits on this is merged.

0003-Make-postgres_fdw-async-capable.patch
Make postgres_fdw async-capable.

0004-Apply-unlikely-to-suggest-synchronous-route-of-ExecA.patch

This can be merged to 0002 but I didn't since the usage of
using these pragmas is arguable.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
0001-Allow-wait-event-set-to-be-registered-to-resource-ow.patch text/x-patch 9.4 KB
0002-Asynchronous-execution-framework.patch text/x-patch 47.1 KB
0003-Make-postgres_fdw-async-capable.patch text/x-patch 50.9 KB
0004-Apply-unlikely-to-suggest-synchronous-route-of-ExecA.patch text/x-patch 1.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2017-02-23 12:10:37 A typo in mcxt.c
Previous Message Masahiko Sawada 2017-02-23 11:45:23 Re: GUC for cleanup indexes threshold.