Quick Links

Re: Parallel Aggregate

From:	Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
To:	David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Paul Ramsey <pramsey(at)cleverelephant(dot)ca>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Parallel Aggregate
Date:	2016-01-22 04:25:26
Message-ID:	CAJrrPGcGn0HUfuqQ6JRe321kBEdz0dTjF9Fatro2X_kJgRc5cA@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Fri, Jan 22, 2016 at 7:44 AM, David Rowley
<david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> On 21 January 2016 at 18:26, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com> wrote:
>> Here I attached update parallel aggregate patch on top of recent commits
>> of combine aggregate and parallel join patch. It still lacks of cost comparison
>> code to compare both parallel and normal aggregates.
>
> Thanks for the updated patch.
>
> I'm just starting to look over this now.
>
> # create table t1 as select x from generate_series(1,1000000) x(x);
> # vacuum ANALYZE t1;
> # set max_parallel_degree =8;
> # explain select sum(x) from t1;
> QUERY PLAN
> -------------------------------------------------------------------------
> Aggregate (cost=9633.33..9633.34 rows=1 width=4)
> -> Parallel Seq Scan on t1 (cost=0.00..8591.67 rows=416667 width=4)
> (2 rows)
>
> I'm not quite sure what's happening here yet as I've not ran it
> through my debugger, but how can we have a Parallel Seq Scan without a
> Gather node? It appears to give correct results, so I can only assume
> it's not actually a parallel scan at all.
>
> Let's check:
>
> # select relname,seq_scan from pg_stat_user_tables where relname ='t1';
> relname | seq_scan
> ---------+----------
> t1 | 0
> (1 row)
>
> # explain analyze select sum(x) from t1;
> QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------
> Aggregate (cost=9633.33..9633.34 rows=1 width=4) (actual
> time=161.820..161.821 rows=1 loops=1)
> -> Parallel Seq Scan on t1 (cost=0.00..8591.67 rows=416667
> width=4) (actual time=0.051..85.348 rows=1000000 loops=1)
> Planning time: 0.040 ms
> Execution time: 161.861 ms
> (4 rows)
>
> # select relname,seq_scan from pg_stat_user_tables where relname ='t1';
> relname | seq_scan
> ---------+----------
> t1 | 1
> (1 row)
>
> Only 1 scan.
>
>
> # explain analyze select * from t1 where x=1;
> QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------
> Gather (cost=1000.00..10633.43 rows=1 width=4) (actual
> time=0.231..49.105 rows=1 loops=1)
> Number of Workers: 2
> -> Parallel Seq Scan on t1 (cost=0.00..9633.33 rows=0 width=4)
> (actual time=29.060..45.302 rows=0 loops=3)
> Filter: (x = 1)
> Rows Removed by Filter: 333333
> Planning time: 0.049 ms
> Execution time: 51.438 ms
> (7 rows)
>
> # select relname,seq_scan from pg_stat_user_tables where relname ='t1';
> relname | seq_scan
> ---------+----------
> t1 | 4
> (1 row)
>
> 3 more scans. This one seems to actually be parallel, and makes sense
> based on "Number of Workers: 2"

The problem was the gather path that is generated on partial path list is
not getting added to path list, because of which, there is a mismatch in
sorted path and cheapest_path, so it leads to a wrong plan.

For temporarily, I marked the sorted_path and cheapest_path as same
and it works fine.

> Also looking at the patch:
>
> +bool
> +aggregates_allow_partial(Node *clause)
> +{
>
> In the latest patch that I sent on the combine aggregates thread:
> http://www.postgresql.org/message-id/CAKJS1f_in9J_ru4gPfygCQLUeB3=RzQ3Kg6RnPN-fzzhdDiyvg@mail.gmail.com
> I made it so there's 3 possible return values from this function. As
> your patch stands now, if I create an aggregate function with an
> INTERNAL state with a combine function set, then this patch might try
> to parallel aggregate that and pass around the pointer to the internal
> state in the Tuple going from the worker to the main process, when the
> main process dereferences this pointer we'll get a segmentation
> violation. So I'd say you should maybe use a modified version of my
> latest aggregates_allow_partial() and check for PAT_ANY, and only
> parallelise the aggregate if you get that value. If the use of
> partial aggregate was within a single process then you could be quite
> content with PAT_INTERNAL_ONLY. You'll just need to pull out the logic
> that checks for serial and deserial functions, since that's not in
> yet, and just have it return PAT_INTERNAL_ONLY if INTERNAL aggregates
> are found which have combine functions set.
>

I took the suggested code changes from combine aggregate patch and
changed accordingly.

Along with these changes, I added a float8 combine function to see
how it works under parallel aggregate, it is working fine for float4, but
giving small data mismatch with float8 data type.

postgres=# select avg(f3), avg(f4) from tbl;
avg | avg
------------------+------------------
1.10000002384186 | 100.123449999879
(1 row)

postgres=# set enable_parallelagg = true;
SET
postgres=# select avg(f3), avg(f4) from tbl;
avg | avg
------------------+------------------
1.10000002384186 | 100.123449999918
(1 row)

Column - f3 - float4
Column - f4 - float8

similar problem for all float8 var_pop, var_samp, stddev_pop and stddev_samp
aggregates. Any special care is needed for float8 datatype?

Regards,
Hari Babu
Fujitsu Australia

Attachment	Content-Type	Size
parallelagg_poc_v6.patch	application/octet-stream	37.8 KB
float8_combine_fn_v1.patch	application/octet-stream	15.8 KB

In response to

Re: Parallel Aggregate at 2016-01-21 20:44:42 from David Rowley

Responses

Re: Parallel Aggregate at 2016-01-22 11:13:38 from David Rowley
Re: Parallel Aggregate at 2016-02-07 22:01:36 from Robert Haas

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Amit Langote	2016-01-22 04:41:54	Re: Declarative partitioning
Previous Message	Tom Lane	2016-01-22 03:28:11	Re: Extracting fields from 'infinity'::TIMESTAMP[TZ]