Re: asynchronous and vectorized execution

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: asynchronous and vectorized execution
Date: 2016-05-11 14:21:37
Message-ID: CA+TgmobwSEndyr669qpyN_u4XhkPo_2C1BAs2ydb22T4niv_aQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, May 10, 2016 at 8:23 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> c. Modify some nodes (perhaps start with nodeAgg.c) to allow them to
>> process a batch TupleTableSlot. This will require some tight loop to
>> aggregate the entire TupleTableSlot at once before returning.
>> d. Add function in execAmi.c which returns true or false depending on
>> if the node supports batch TupleTableSlots or not.
>> e. At executor startup determine if the entire plan tree supports
>> batch TupleTableSlots, if so enable batch scan mode.
>
> It doesn't really need to be the entire tree. Even if you have a subtree
> (say a parametrized index nested loop join) which doesn't support batch
> mode, you'll likely still see performance benefits by building a batch
> one layer above the non-batch-supporting node.

+1.
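
To make the point concrete, here is a toy model in Python (not PostgreSQL code; every name in it is hypothetical). It assumes a subtree that can only hand back one tuple at a time, and puts a small adapter directly above it that builds batches for whatever batch-aware node sits higher up:

```python
from itertools import islice


def tuple_at_a_time_child():
    """Stand-in for a subtree that only supports one-tuple-at-a-time output."""
    for i in range(10):
        yield (i, i * i)


def batch_above(child, batch_size):
    """Collect tuples from a non-batch child into lists of up to batch_size."""
    it = iter(child)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def batched_sum(batches, column):
    """Batch-aware consumer: aggregates each whole batch in one tight loop."""
    total = 0
    for batch in batches:
        total += sum(row[column] for row in batch)
    return total
```

The child is still called once per tuple, but everything above the adapter pays its per-call overhead once per batch instead.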

I've also wondered about building a new executor node that is sort of
a combination of Nested Loop and Hash Join, but capable of performing
multiple joins in a single operation. (Merge Join is different,
because it's actually matching up the two sides, not just probing
once per outer tuple.) So the plan tree would look something like
this:

Multiway Join
  -> Seq Scan on driving_table
  -> Index Scan on something
  -> Index Scan on something_else
  -> Hash
       -> Seq Scan on other_thing
  -> Hash
       -> Seq Scan on other_thing_2
  -> Index Scan on another_one

With the current structure, every level of the plan tree has its own
TupleTableSlot and we have to project into each new slot. Every level
has to go through ExecProcNode. So it seems to me that this sort of
structure might save quite a few cycles on deep join nests. I haven't
tried it, though.
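
A rough sketch of the idea, again as a Python toy rather than executor code. It assumes each inner side can be modeled as a dict from join key to a list of matching rows (standing in for either an index or a hash table); multiway_join, probes, and project are hypothetical names. One node walks all the probes for each driving tuple and projects only once at the top:

```python
def multiway_join(driving_rows, probes, project):
    """Join each driving row against every probe in turn, projecting once.

    probes is a list of (key_func, lookup) pairs. key_func receives the
    tuple of rows joined so far and returns the join key; lookup maps a
    join key to a list of matching inner rows.
    """
    for outer in driving_rows:
        # 'partial' holds the row combinations built so far for this outer row.
        partial = [(outer,)]
        for key_func, lookup in probes:
            partial = [combo + (inner,)
                       for combo in partial
                       for inner in lookup.get(key_func(combo), [])]
            if not partial:
                break  # no match at this step, so the inner join drops the row
        for combo in partial:
            yield project(combo)  # the only projection, done at the top
```

There is one loop over the driving table and one output projection, rather than one intermediate slot and one call boundary per join level. Whether that actually saves cycles in the real executor is untested here.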

With batching, this sort of structure gets even better.
Assuming the joins are all basically semi-joins, either because they
were written that way or because they are probing unique indexes or
whatever, you can fetch a batch of tuples from the driving table, do
the first join for each tuple to create a matching batch of tuples,
and repeat for each join step. Then at the end you project.
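
The batched variant can be sketched the same way. This is a Python toy under the stated assumption that every join step finds at most one match per outer tuple (modeled as a dict from key to a single inner value); run_batch, batched_join, and the other names are hypothetical:

```python
from itertools import islice


def run_batch(batch, steps, project):
    """Apply every join step to a whole batch, then project the survivors."""
    # Each element of 'combos' is a tuple: (outer_row, inner_1, inner_2, ...).
    combos = [(row,) for row in batch]
    for key_func, unique_lookup in steps:
        matched = []
        for combo in combos:
            inner = unique_lookup.get(key_func(combo[0]))
            if inner is not None:  # at most one match per outer tuple
                matched.append(combo + (inner,))
        combos = matched
    return [project(combo) for combo in combos]


def batched_join(driving_rows, steps, project, batch_size=1024):
    """Fetch the driving table in batches and run all join steps per batch."""
    it = iter(driving_rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield from run_batch(batch, steps, project)
```

Because no step can multiply rows, the batch only ever shrinks, so each step is a simple loop over the current batch and projection is deferred to the end.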

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
