Re: using custom scan nodes to prototype parallel sequential scan

From: David Rowley <dgrowley(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: using custom scan nodes to prototype parallel sequential scan
Date: 2014-11-14 09:02:49
Message-ID: CAHoyFK9Hhsx3v9mr=BDf4wBaz5rmvrgBVLwbyx86A7NYBBtzvw@mail.gmail.com
Lists: pgsql-hackers

On 14 November 2014 20:37, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:

> On 11/12/14, 1:54 AM, David Rowley wrote:
>
>>
>> We'd also need to add some infrastructure to merge aggregate states
>> together for this to work properly. This means it could also work for
>> avg() and stddev() etc. For max() and min() the merge functions would
>> likely just be the same as the transition functions.
>>
>
> Sanity check: what % of a large aggregate query fed by a seqscan is
> actually spent in the aggregate functions? Even if you look strictly at
> CPU cost, isn't there more code involved in getting data to the aggregate
> function than in the aggregation itself, except maybe for numeric?
>

You might be right, but that sounds like it would need all the parallel
workers to send each matching tuple to a queue to be processed by some
single aggregate node. I guess this would be more of an advantage for wider
tables, tables with many dead tuples, or queries with quite a selective
WHERE clause, as less data would make it onto that queue.
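
Roughly, the shape I have in mind for each worker is below (all the
function and type names here are made up for illustration; a real
shared-memory tuple queue would also need tuple serialisation and flow
control):

#include "postgres.h"

/*
 * Hypothetical per-worker loop: scan this worker's slice of the heap,
 * apply the quals, and push only the matching tuples onto a shared
 * queue for a single aggregate node to consume.
 */
static void
worker_partial_scan(WorkerScanState *state, SharedTupleQueue *queue)
{
	HeapTuple	tup;

	while ((tup = worker_seqscan_next(state)) != NULL)
	{
		if (qual_passes(state, tup))
			shm_queue_push(queue, tup);
	}
}

The narrower the stream of surviving tuples, the less traffic there is on
that queue, which is why tuple width and selectivity matter so much for
this approach.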

Perhaps I've taken one step too far forward here. I had been thinking that
each worker would perform the partial seqscan and, in the worker context,
pass each tuple down to the aggregate node. Then, once each worker had
completed, some other, perhaps new, node type (MergeAggregateStates) would
merge all those intermediate agg states into the final agg state (which
would then be ready for the final function to be called).
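
To make the merge step concrete, a merge function for avg(float8) might
look something like the following (the function name is made up, and I'm
pretending the transition state is just {N, sum(X)}, whereas the real
float8_accum state also carries sum(X*X) for the stddev family):

#include "postgres.h"
#include "fmgr.h"
#include "utils/array.h"

PG_FUNCTION_INFO_V1(float8_avg_merge);

/*
 * Hypothetical merge (combine) function for avg(float8): merging two
 * partial transition states is just componentwise addition. A real
 * version would also need to handle NULL states and be careful about
 * which memory context the result lives in.
 */
Datum
float8_avg_merge(PG_FUNCTION_ARGS)
{
	ArrayType  *state1 = PG_GETARG_ARRAYTYPE_P(0);
	ArrayType  *state2 = PG_GETARG_ARRAYTYPE_P(1);
	float8	   *v1 = (float8 *) ARR_DATA_PTR(state1);
	float8	   *v2 = (float8 *) ARR_DATA_PTR(state2);

	v1[0] += v2[0];		/* N */
	v1[1] += v2[1];		/* sum(X) */

	PG_RETURN_ARRAYTYPE_P(state1);
}

For max() and min() there'd be nothing new to write at all, since merging
two partial states is exactly what the existing transition functions
already do.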

Are there any plans for what will be in charge of deciding how many workers
to allocate to a parallel query? Will this be something that's done at
planning time? Or should the planner just create a parallel-friendly plan
if the plan is costly enough, and then allow the executor to decide how
many workers to throw at the job based on how busy the system is with
other tasks at execution time?

Regards

David Rowley
