Re: Combining Aggregates

From: David Rowley <dgrowley(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Combining Aggregates
Date: 2015-03-06 07:11:09
Message-ID: CAHoyFK8oR5AMFqUpFWhEo87MCTwRSsvjiKVBgnvq1-0bEM+C2g@mail.gmail.com
Lists: pgsql-hackers

On 6 March 2015 at 19:01, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
wrote:

> Postgres-XC solved this question by creating a plan with two Agg/Group
> nodes, one for combining transitioned result and one for creating the
> distributed transition results (one per distributed run per group).
>

> So, Agg/Group for combining result had as many Agg/Group nodes as there
> are distributed/parallel runs.
>

This sounds like the planner must force the executor to run the plan on a
fixed number of worker processes.
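
To make the transition/combine split concrete, here is a minimal sketch in
Python rather than PostgreSQL C. The function names and the round-robin
split are assumptions made purely for illustration, not anything taken from
Postgres-XC or the proposed patch. It computes avg() per group from partial
(sum, count) states, and the combine step accepts however many partial
states it is handed:

```python
from collections import defaultdict


def transition(rows):
    """Per-worker step: build a partial (sum, count) state for each group."""
    state = defaultdict(lambda: [0, 0])
    for key, value in rows:
        state[key][0] += value
        state[key][1] += 1
    return dict(state)


def combine(partials):
    """Merge any number of partial states, then finalise avg() per group."""
    merged = defaultdict(lambda: [0, 0])
    for partial in partials:
        for key, (total, count) in partial.items():
            merged[key][0] += total
            merged[key][1] += count
    return {key: total / count for key, (total, count) in merged.items()}


def parallel_avg(rows, nworkers):
    # Hypothetical stand-in for distributing rows: a round-robin split.
    chunks = [rows[i::nworkers] for i in range(nworkers)]
    return combine(transition(chunk) for chunk in chunks)


rows = [("a", 1), ("a", 3), ("b", 10), ("b", 20), ("b", 30), ("c", 5)]

# The result is the same whether 1, 2 or 4 "workers" produced the partials.
print(parallel_avg(rows, 1))
print(parallel_avg(rows, 2))
print(parallel_avg(rows, 4))
```

In this sketch nothing in combine() depends on the worker count, so the
number of workers could be picked at execution time.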

I had really hoped that we could, one day, have a load monitor process that
decides the best number of worker processes to execute a parallel plan on.
Otherwise, how would we decide how many worker processes to allocate to a
plan? Surely there must be times when using only half of the processors for
a query would be better than trying to use all of them and incurring many
more context switches.

Probably the harder part of dynamically deciding the number of workers
would be the costing. The plan might execute fastest with 32 workers, but
if it were given only 2 workers it might execute better as a non-parallel
plan.
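
A toy cost model shows how that can happen. Everything below is a
hypothetical illustration: the formula, the constants and the function
names are assumptions, and none of it reflects the actual PostgreSQL cost
model. The parallel plan pays a fixed startup cost plus a small per-worker
cost, and splits the work evenly:

```python
def serial_cost(total_work):
    """Cost of the non-parallel plan: just the work itself."""
    return float(total_work)


def parallel_cost(total_work, nworkers, startup=1000.0, per_worker=1.0):
    """Assumed model: fixed startup + per-worker overhead + evenly split work."""
    return startup + per_worker * nworkers + total_work / nworkers


def choose_plan(total_work, nworkers):
    """Pick the cheaper plan for the number of workers actually available."""
    if nworkers < 2:
        return "serial"
    if parallel_cost(total_work, nworkers) < serial_cost(total_work):
        return "parallel"
    return "serial"


work = 1500
print(serial_cost(work))        # 1500.0
print(parallel_cost(work, 2))   # 1752.0
print(parallel_cost(work, 32))  # 1078.875
print(choose_plan(work, 2))     # serial
print(choose_plan(work, 32))    # parallel
```

With these made-up numbers the parallel plan wins with 32 workers but loses
to the serial plan with 2, so a plan costed for one worker count can be the
wrong choice when a different count is granted at execution time.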

> But XC chose this way to reduce the code footprint. In Postgres, we can
> have different nodes for combining and transitioning as you have specified
> above. Aggregation is not pathified in current planner, hence XC took the
> approach of pushing the Agg nodes down the plan tree when there was
> distributed/parallel execution possible. If we can get aggregation
> pathified, we can go by path-based approach which might give a better
> judgement of whether or not to distribute the aggregates itself.
>
> Looking at Postgres-XC might be useful to get ideas. I can help you there.
>
>

Regards

David Rowley
