Re: Parallel Aggregate

From: James Sewell <james(dot)sewell(at)lisasoft(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Aggregate
Date: 2016-03-14 01:16:44
Message-ID: CANkGpBtAYbXtB9GTvRQqLMdeMKTtJEU9mhdAwmjY-X47OE2Naw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I've done some testing with one of my data sets in an 8VPU virtual
environment and this is looking really, really good.

My test query is:

SELECT pageview, sum(pageview_count)
FROM fact_agg_2015_12
GROUP BY date_trunc('DAY'::text, pageview);

The query returns 15 rows. The fact_agg table is 5398MB and holds around 25
million records.

Explain with a max_parallel_degree of 8 tells me that the query will only
use 6 background workers. I have no indexes on the table currently.

Finalize HashAggregate (cost=810142.42..810882.62 rows=59216 width=16)
Group Key: (date_trunc('DAY'::text, pageview))
-> Gather (cost=765878.46..808069.86 rows=414512 width=16)
Number of Workers: 6
-> Partial HashAggregate (cost=764878.46..765618.66 rows=59216
width=16)
Group Key: date_trunc('DAY'::text, pageview)
-> Parallel Seq Scan on fact_agg_2015_12
(cost=0.00..743769.76 rows=4221741 width=12)

I am getting the following timings (everything was cached before I started
tested). I didn't average the runtime, but I ran each one three times and
took the middle value.

*max_parallel_degree runtime*
0 11693.537 ms
1 6387.937 ms
2 4328.629 ms
3 3292.376 ms
4 2743.148 ms
5 2278.449 ms
6 2000.599 ms

I'm pretty happy!

Cheers,

James Sewell,
PostgreSQL Team Lead / Solutions Architect
______________________________________

Level 2, 50 Queen St, Melbourne VIC 3000

*P *(+61) 3 8370 8000 *W* www.lisasoft.com *F *(+61) 3 8370 8099

On Mon, Mar 14, 2016 at 8:44 AM, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
wrote:

> On 12 March 2016 at 16:31, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
> wrote:
> > I've attached an updated patch which is based on commit 7087166,
> > things are really changing fast in the grouping path area at the
> > moment, but hopefully the dust is starting to settle now.
>
> The attached patch fixes a harmless compiler warning about a possible
> uninitialised variable.
>
> --
> David Rowley http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

--

------------------------------
The contents of this email are confidential and may be subject to legal or
professional privilege and copyright. No representation is made that this
email is free of viruses or other defects. If you have received this
communication in error, you may not copy or distribute any part of it or
otherwise disclose its contents to anyone. Please advise the sender of your
incorrect receipt of this correspondence.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2016-03-14 01:30:35 Re: Parallel Aggregate
Previous Message Tomas Vondra 2016-03-14 01:00:03 Re: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system