Re: Statistical aggregate functions are not working with partitionwise aggregate

From: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
To: Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Statistical aggregate functions are not working with partitionwise aggregate
Date: 2019-05-03 11:56:45
Message-ID: CAM2+6=XobXLhtF_qQQJAGdzVHwmSihYwM356294eVDnCqB56+A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, May 3, 2019 at 2:56 PM Rajkumar Raghuwanshi <
rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com> wrote:

> Hi,
>
> On PG-head, Some of statistical aggregate function are not giving correct
> output when enable partitionwise aggregate while same is working on v11.
>

I had a quick look over this and observed that something broken with the
PARTIAL aggregation.

I can reproduce same issue with the larger dataset which results into
parallel scan.

CREATE TABLE tbl1(a int2,b float4) partition by range(a);
create table tbl1_p1 partition of tbl1 for values from (minvalue) to (0);
create table tbl1_p2 partition of tbl1 for values from (0) to (maxvalue);
insert into tbl1 select i%2, i from generate_series(1, 1000000) i;

# SELECT regr_count(b, a) FROM tbl1;
regr_count
------------
0
(1 row)

postgres:5432 [120536]=# explain SELECT regr_count(b, a) FROM tbl1;
QUERY
PLAN
------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=15418.08..15418.09 rows=1 width=8)
-> Gather (cost=15417.87..15418.08 rows=2 width=8)
Workers Planned: 2
-> Partial Aggregate (cost=14417.87..14417.88 rows=1 width=8)
-> Parallel Append (cost=0.00..11091.62 rows=443500
width=6)
-> Parallel Seq Scan on tbl1_p2 (cost=0.00..8850.00
rows=442500 width=6)
-> Parallel Seq Scan on tbl1_p1 (cost=0.00..24.12
rows=1412 width=6)
(7 rows)

postgres:5432 [120536]=# set max_parallel_workers_per_gather to 0;
SET
postgres:5432 [120536]=# SELECT regr_count(b, a) FROM tbl1;
regr_count
------------
1000000
(1 row)

After looking further, it seems that it got broken by following commit:

commit a9c35cf85ca1ff72f16f0f10d7ddee6e582b62b8
Author: Andres Freund <andres(at)anarazel(dot)de>
Date: Sat Jan 26 14:17:52 2019 -0800

Change function call information to be variable length.

This commit is too big to understand and thus could not get into the excact
cause.

Thanks

> below are some of examples.
>
> CREATE TABLE tbl(a int2,b float4) partition by range(a);
> create table tbl_p1 partition of tbl for values from (minvalue) to (0);
> create table tbl_p2 partition of tbl for values from (0) to (maxvalue);
> insert into tbl values (-1,-1),(0,0),(1,1),(2,2);
>
> --when partitionwise aggregate is off
> postgres=# SELECT regr_count(b, a) FROM tbl;
> regr_count
> ------------
> 4
> (1 row)
> postgres=# SELECT regr_avgx(b, a), regr_avgy(b, a) FROM tbl;
> regr_avgx | regr_avgy
> -----------+-----------
> 0.5 | 0.5
> (1 row)
> postgres=# SELECT corr(b, a) FROM tbl;
> corr
> ------
> 1
> (1 row)
>
> --when partitionwise aggregate is on
> postgres=# SET enable_partitionwise_aggregate = true;
> SET
> postgres=# SELECT regr_count(b, a) FROM tbl;
> regr_count
> ------------
> 0
> (1 row)
> postgres=# SELECT regr_avgx(b, a), regr_avgy(b, a) FROM tbl;
> regr_avgx | regr_avgy
> -----------+-----------
> |
> (1 row)
> postgres=# SELECT corr(b, a) FROM tbl;
> corr
> ------
>
> (1 row)
>
> Thanks & Regards,
> Rajkumar Raghuwanshi
> QMG, EnterpriseDB Corporation
>

--
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Rafia Sabih 2019-05-03 12:55:47 Re: make \d pg_toast.foo show its indices
Previous Message Rajkumar Raghuwanshi 2019-05-03 09:26:04 Statistical aggregate functions are not working with partitionwise aggregate