parallel aggregation for PostgreSQL 9.5

From: PostgreSQL - Hans-Jürgen Schönig <postgres(at)cybertec(dot)at>
To: PostgreSQL Announce <pgsql-announce(at)postgresql(dot)org>
Subject: parallel aggregation for PostgreSQL 9.5
Date: 2015-12-01 17:47:45
Message-ID: 0B9D5042-8862-413F-AE1E-C1B2CBF0528A@cybertec.at
Lists: pgsql-announce

agg-1.0: Bringing multi-core to PostgreSQL aggregations
===========================================

Cybertec Schönig & Schönig GmbH (http://www.cybertec.at/) is proud to announce
the first version of "agg", which brings multi-core analytics to PostgreSQL 9.5.
"agg" can be loaded as an extension and is able to scale out aggregations to
more than just one CPU core, speeding up queries significantly.

Tests have shown that queries on a 40 core box are up to 30 times faster than on
a single core. "agg" is pushing the limits of PostgreSQL even further and
marks a significant milestone for analytical workloads.

An example:
==========

To show the true potential of agg we have compiled some benchmarking data:

Test case:

- 40 CPU cores (Intel)
- 200 million rows

Running a single-core test:

agg=# SET agg.hash_workers = 1;
SET
agg=# SELECT val1, count(*) FROM t_agg GROUP BY 1 LIMIT 2;
 val1 |  count
------+---------
   34 | 5000000
   25 | 5000000
(2 rows)

Time: 55701.966 ms

With 10 cores, agg achieves an almost perfectly linear improvement (roughly 10x):

agg=# SET agg.hash_workers = 10;
SET
agg=# SELECT val1, count(*) FROM t_agg GROUP BY 1 LIMIT 2;
 val1 |  count
------+---------
   34 | 5000000
   25 | 5000000
(2 rows)

Time: 5574.891 ms

With all CPU cores at work, the stunning number of 200 million rows can be
aggregated in roughly 2.6 seconds. In other words: PostgreSQL crunched 77
million rows per second:

agg=# SET agg.hash_workers = 40;
SET
agg=# SELECT val1, count(*) FROM t_agg GROUP BY 1 LIMIT 2;
 val1 |  count
------+---------
   34 | 5000000
   25 | 5000000
(2 rows)

Time: 2596.967 ms
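
The three timings above can be turned into speedup, scaling efficiency and
throughput figures. The following is a small sketch in Python, not part of
"agg"; the function names are made up for illustration, and the only inputs
are the timings and the row count quoted in this announcement.

```python
# Timings copied from the psql sessions above, in milliseconds,
# keyed by the value of agg.hash_workers.
timings_ms = {1: 55701.966, 10: 5574.891, 40: 2596.967}
ROWS = 200_000_000  # table size stated in the test case


def speedup(workers: int) -> float:
    """Speedup relative to the single-worker run."""
    return timings_ms[1] / timings_ms[workers]


def efficiency(workers: int) -> float:
    """Fraction of ideal linear scaling (1.0 means perfectly linear)."""
    return speedup(workers) / workers


def rows_per_second(workers: int) -> float:
    """Rows aggregated per second of wall-clock time."""
    return ROWS / (timings_ms[workers] / 1000.0)


for w in sorted(timings_ms):
    print(f"{w:>2} workers: speedup {speedup(w):5.2f}x, "
          f"efficiency {efficiency(w):4.0%}, "
          f"{rows_per_second(w) / 1e6:5.1f} M rows/s")
```

For these numbers the speedup is about 10.0x with 10 workers and about 21.4x
with 40 workers, and the 40-worker run works out to roughly 77 million rows per
second, matching the figure quoted above.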

Installation:
=========

"agg" can be downloaded freely from our website:

http://www.cybertec.at/en/products/agg-parallel-aggregations-postgresql/

It can be loaded into PostgreSQL 9.5 as a simple extension.
The module is completely transparent and does not require changes on the SQL
level.

Features and limitations:
===================

"agg" has been optimized to scale aggregations and sequential scans. It sits
between the optimizer and the executor, post-processing a standard PostgreSQL
execution plan.
If agg discovers a suitable plan, it replaces standard routines with our
multi-core implementations. If a query is not suitable for parallel
execution, agg just leaves the PostgreSQL plan as is.
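
To make the idea of post-processing a plan more concrete, here is a toy model
in Python. It is only an illustration of the general technique and not agg's
actual code: the PlanNode class, the node names and the suitability rule are
all assumptions invented for this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """Toy stand-in for an execution plan node (hypothetical, not a PostgreSQL type)."""
    kind: str                                   # e.g. "HashAggregate", "SeqScan"
    children: List["PlanNode"] = field(default_factory=list)


# Inputs this toy model treats as scannable in parallel (an assumption).
PARALLEL_SCANNABLE = {"SeqScan", "Append"}


def is_suitable(node: PlanNode) -> bool:
    """A subtree is suitable if it consists only of parallel-scannable nodes."""
    return node.kind in PARALLEL_SCANNABLE and all(
        is_suitable(child) for child in node.children
    )


def post_process(node: PlanNode) -> PlanNode:
    """Return a copy of the plan with suitable aggregates swapped for a parallel variant."""
    children = [post_process(child) for child in node.children]
    if node.kind == "HashAggregate" and all(is_suitable(c) for c in node.children):
        return PlanNode("ParallelHashAggregate", children)
    # Anything else (joins, for example) is passed through unchanged.
    return PlanNode(node.kind, children)


# An aggregate over a plain scan is rewritten; one over a join is left as is.
simple = PlanNode("Limit", [PlanNode("HashAggregate", [PlanNode("SeqScan")])])
joined = PlanNode("HashAggregate",
                  [PlanNode("HashJoin", [PlanNode("SeqScan"), PlanNode("SeqScan")])])
print(post_process(simple).children[0].kind)
print(post_process(joined).kind)
```

The point of the sketch is the fallback behaviour: a plan that does not match
the rule comes back structurally identical, which is what makes such a rewrite
step transparent to the SQL level.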

Supported features:
- Parallel aggregations
- Support for FILTER
- Procedures containing suitable queries
- Parallel scanning of single tables
- Parallel scanning of partitioned tables

Not supported:
- Parallel joins
- SMP-aware CREATE INDEX, VACUUM, etc.
- Grouping sets

24x7 support:
=============

Cybertec Schönig & Schönig GmbH offers professional 24x7 support, consulting,
and training to professional users deploying PostgreSQL and "agg" in their
environments.

Contact office(at)cybertec(dot)at for further information.

many thanks,

hans

--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at
