Re: asynchronous execution

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: asynchronous execution
Date: 2016-10-03 21:00:40
Message-ID: CA+TgmoYMoB4OG1W6KZjgRda1J9=Lo1fXpH0YXjzvSEwU5rqhVA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 28, 2016 at 12:30 AM, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
> On 24 September 2016 at 06:39, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> Since Kyotaro Horiguchi found that my previous design had a
>> system-wide performance impact due to the ExecProcNode changes, I
>> decided to take a different approach here: I created an async
>> infrastructure where both the requestor and the requestee have to be
>> specifically modified to support parallelism, and then modified Append
>> and ForeignScan to cooperate using the new interface. Hopefully that
>> means that anything other than those two nodes will suffer no
>> performance impact. Of course, it might have other problems....
>
> I see that the reason why you re-designed the asynchronous execution
> implementation is because the earlier implementation showed
> performance degradation in local sequential and local parallel scans.
> But I checked that the ExecProcNode() changes were not that
> significant as to cause the degradation.

I think we need some testing to prove that one way or the other. If
you can do some - say on a plan with multiple nested loop joins with
inner index-scans, which will call ExecProcNode() a lot - that would
be great. I don't think we can just rely on "it doesn't seem like it
should be slower", though - ExecProcNode() is too important a function
for us to guess at what the performance will be.

The thing I'm really worried about with either implementation is what
happens when we start to add asynchronous capability to multiple
nodes. For example, if you imagine a plan like this:

Append
  -> Hash Join
       -> Foreign Scan
       -> Hash
            -> Seq Scan
  -> Hash Join
       -> Foreign Scan
       -> Hash
            -> Seq Scan

In order for this to run asynchronously, you need not only Append and
Foreign Scan to be async-capable, but also Hash Join. That's true in
either approach. Things are slightly better with the original
approach, but the basic problem is there in both cases. So it seems
we need an approach that will make adding async capability to a node
really cheap, which seems like it might be a problem.
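
To make the dependency concrete, here is a rough sketch in Python of a
toy plan tree. Everything in it (PlanNode, the async_capable flag,
async_foreign_scans, example_plan) is hypothetical and made up for
illustration; none of it corresponds to the actual executor structures
or to either patch. It only shows that a Foreign Scan can be driven
asynchronously from the top when every node on the path down to it is
async-capable.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """Hypothetical toy plan node; not a PostgreSQL data structure."""
    name: str
    async_capable: bool = False
    children: List["PlanNode"] = field(default_factory=list)


def async_foreign_scans(node: PlanNode) -> List[PlanNode]:
    """Collect Foreign Scan nodes reachable from `node` through an
    unbroken chain of async-capable nodes (assumed rule for this sketch)."""
    if not node.async_capable:
        # A node that cannot pass async requests along blocks its subtree.
        return []
    if node.name == "Foreign Scan":
        return [node]
    found: List[PlanNode] = []
    for child in node.children:
        found.extend(async_foreign_scans(child))
    return found


def example_plan(hash_join_async: bool) -> PlanNode:
    """Build the Append-over-two-Hash-Joins plan discussed above."""
    def branch() -> PlanNode:
        return PlanNode("Hash Join", hash_join_async, [
            PlanNode("Foreign Scan", True),
            PlanNode("Hash", False, [PlanNode("Seq Scan", False)]),
        ])
    return PlanNode("Append", True, [branch(), branch()])


if __name__ == "__main__":
    # Only Append and Foreign Scan are async-capable: nothing is reachable.
    print(len(async_foreign_scans(example_plan(hash_join_async=False))))
    # Hash Join is async-capable too: both Foreign Scans are reachable.
    print(len(async_foreign_scans(example_plan(hash_join_async=True))))
```

In this toy model, flipping the one flag on Hash Join is all it takes,
which is exactly the part that is not cheap in a real executor node.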

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
