Re: PoC: Grouped base relation

From: Antonin Houska <ah(at)cybertec(dot)at>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PoC: Grouped base relation
Date: 2017-01-19 11:55:29
Message-ID: 24177.1484826929@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Antonin Houska <ah(at)cybertec(dot)at> wrote:

Well, the following one does not seem to be a typical example. I could
generate the plan, but now I think that the aggregation push down does not in
general decrease the number of groups the final aggregation has to
process. Maybe I just hit planner limitation to estimate the number of groups
within append relation.

> For this query
>
> SELECT
> p.id, sum(price)
> FROM
> products AS p
> JOIN sales AS s ON s.product_id = p.id
> GROUP BY
> p.id
>
> I get this plan at "normal circumstances"
>
> HashAggregate
> Group Key: p.id
> -> Hash Join
> Hash Cond: (s.product_id = p.id)
> -> Gather
> Workers Planned: 2
> -> Append
> -> Parallel Seq Scan on sales s
> -> Parallel Seq Scan on sales_2015 s_1
> -> Parallel Seq Scan on sales_2016 s_2
> -> Parallel Seq Scan on sales_2017 s_3
> -> Hash
> -> Gather
> Workers Planned: 2
> -> Append
> -> Parallel Seq Scan on products p
> -> Parallel Seq Scan on products_01 p_1
> -> Parallel Seq Scan on products_02 p_2
> -> Parallel Seq Scan on products_03 p_3
> -> Parallel Seq Scan on products_04 p_4
>
>
> but if work_mem is sufficiently low for the hash join to be efficient, the
> aggregation can be moved to individual partitions.
>
> Gather
> Workers Planned: 1
> Single Copy: true
> -> Finalize HashAggregate
> Group Key: p.id
> -> Hash Join
> Hash Cond: (p.id = s.product_id)
> -> Append
> -> Partial HashAggregate
> Group Key: p.id
> -> Seq Scan on products p
> -> Partial HashAggregate
> Group Key: p_1.id
> -> Seq Scan on products_01 p_1
> -> Partial HashAggregate
> Group Key: p_2.id
> -> Seq Scan on products_02 p_2
> -> Partial HashAggregate
> Group Key: p_3.id
> -> Seq Scan on products_03 p_3
> -> Partial HashAggregate
> Group Key: p_4.id
> -> Seq Scan on products_04 p_4
> -> Hash
> -> Append
> -> Partial HashAggregate
> Group Key: s.product_id
> -> Seq Scan on sales s
> -> Partial HashAggregate
> Group Key: s_1.product_id
> -> Seq Scan on sales_2015 s_1
> -> Partial HashAggregate
> Group Key: s_2.product_id
> -> Seq Scan on sales_2016 s_2
> -> Partial HashAggregate
> Group Key: s_3.product_id
> -> Seq Scan on sales_2017 s_3

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Rafia Sabih 2017-01-19 12:07:19 Re: Parallel Index-only scan
Previous Message Kuntal Ghosh 2017-01-19 11:53:37 Re: parallelize queries containing subplans