Re: asynchronous execution

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: robertmhaas(at)gmail(dot)com
Cc: amitdkhan(dot)pg(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: asynchronous execution
Date: 2016-10-18 01:30:51
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello, this works but ExecAppend gets a bit degradation.

At Mon, 03 Oct 2016 19:46:32 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote in <20161003(dot)194632(dot)204401048(dot)horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
> > Some notes:
> >
> > - EvalPlanQual rechecks are broken.

This is fixed by adding (restoring) async-cancelation.

> > - EXPLAIN ANALYZE instrumentation is broken.

EXPLAIN ANALYE seems working but async-specific information is
not available yet.

> > - ExecReScanAppend is broken, because the async stuff needs some way
> > of canceling an async request and I didn't invent anything like that
> > yet.

Fixed as EvalPlanQual.

> > - The postgres_fdw changes pretend to be async but aren't actually.
> > It's just a demo of (part of) the interface at this point.

Applied my previous patch with some modification.

> > - The postgres_fdw changes also report all pg-fdw paths as
> > async-capable, but actually the direct-modify ones aren't, so the
> > regression tests fail.

All actions other than scan does vacate_connection() to use a

> > - Errors in the executor can leak the WaitEventSet. Probably we need
> > to modify ResourceOwners to be able to own WaitEventSets.

WaitEventSet itself is not leaked but epoll-fd should be closed
at failure. This seems doable with TRY-CATCHing in
ExecAsyncEventLoop. (not yet)

> > - There are probably other bugs, too.
> >
> > Whee!
> >
> > Note that I've tried to solve the re-entrancy problems by (1) putting
> > all of the event loop's state inside the EState rather than in local
> > variables and (2) having the function that is called to report arrival
> > of a result be thoroughly different than the function that is used to
> > return a tuple to a synchronous caller.
> >
> > Comments welcome, if you're feeling brave enough to look at anything
> > this half-baked.

This doesn't cause reentry since this no longer bubbles up
tupples through async-unaware nodes. This framework passes tuples
through private channels for requestor and requestees.

Anyway, I amended this and made postgres_fdw async and then
finally all regtests passed with minor modifications. The
attached patches are the following.

The patch Robert shown upthread

A small patch to fix complation errors of 0001.

Several modifications on the infrastructure. The details are
shown after the measurement below.

True-async postgres-fdw.

Performance test script suite.

gentblr.sql - creates test tables. - does single test run and - drives and summirize its results.

I measured performance and had the following result.

t0 - SELECT sum(a) FROM <local single table>;
pl - SELECT sum(a) FROM <4 local children>;
pf0 - SELECT sum(a) FROM <4 foreign children on single connection>;
pf1 - SELECT sum(a) FROM <4 foreign children on dedicate connections>;

The result is written as "time<ms> (std dev <ms>)"

t0: 3820.33 ( 1.88)
pl: 1608.59 ( 12.06)
pf0: 7928.29 ( 46.58)
pf1: 8023.16 ( 26.43)

t0: 3806.31 ( 4.49) 0.4% faster (should be error)
pl: 1629.17 ( 0.29) 1.3% slower
pf0: 6447.07 ( 25.19) 18.7% faster
pf1: 1876.80 ( 47.13) 76.6% faster

t0 is not affected since the ExecProcNode stuff has gone.

pl is getting a bit slower. (almost the same to simple seqscan of
the previous patch) This should be a misprediction penalty.

pf0, pf1 are faster as expected.


The below is a summary of modifications made by 0002 and 0003 patch.

execAsync.c, execnodes.h:

- Added include "pgstat.h" to use WAIT_EVENT_ASYNC_WAIT.

- Changed the interface of ExecAsyncRequest to return if a tuple is
immediately available or not.

- Made ExecAsyncConfigureWait to return if it registered at least
one waitevent or not. This is used to know the caller
(ExecAsyncEventWait) has a event to wait (for safety).

If two or more postgres_fdw nodes are sharing one connection,
only one of them can be waited at once. It is a
responsibility to the FDW drivers to ensure at least one wait
event to be added but on failure WaitEventSetWait silently
waits forever.

- There were separate areq->callback_pending and
areq->request_complete but they are altering together so they are
replaced with one state variable areq->state. New enum
AsyncRequestState for areq->state in execnodes.h.


- Return a tuple immediately if ExecAsyncRequest says that a
tuple is available.

- Reduced nest level of for(;;).

nodeForeignscan.[ch], fdwapi.h, execProcnode.c::

- Calling postgresIterateForeignScan can yield tuples in wrong
shape. Call ExecForeignScan instead.

- Changed the interface of AsyncConfigureWait as execAsync.c.

- Added ShutdownForeignScan interface.

createplan.c, ruleutils.c, plannodes.h:

- With the Rebert's change, explain shows somewhat odd plans
where the Output of Append is named after non-parent
child. This does not harm but uneasy. Added index of the
parent in Append.referent to make it reasoable. (But this
looks ugly..). Still children in explain are in different
order from definition. (expected/postgres_fdw.out is edited)


Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
0001-robert-s-2nd-framework.patch text/x-patch 42.9 KB
0002-Fix-some-bugs.patch text/x-patch 16.3 KB
0003-Modify-async-execution-infrastructure.patch text/x-patch 29.7 KB
0004-Make-postgres_fdw-async-capable.patch text/x-patch 43.2 KB
unknown_filename text/plain 3.3 KB
unknown_filename text/plain 449 bytes
unknown_filename text/plain 808 bytes

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Gavin Flower 2016-10-18 01:32:14 Re: emergency outage requiring database restart
Previous Message Michael Paquier 2016-10-18 01:12:51 Re: emergency outage requiring database restart