Re: Parallel grouping sets

From: Pengzhou Tang <ptang(at)pivotal(dot)io>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Richard Guo <guofenglinux(at)gmail(dot)com>, Jesse Zhang <sbjesse(at)gmail(dot)com>, Richard Guo <riguo(at)pivotal(dot)io>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Subject: Re: Parallel grouping sets
Date: 2020-03-20 11:57:02
Message-ID: CAG4reAQRbuLd_fhjyhjHSXFDSeBmCO=rrDM53O8v04dR071uaQ@mail.gmail.com
Lists: pgsql-hackers

Hi Tomas,

I rebased the code and resolved the comments you attached. The comments I
did not resolve are explained in 0002-fixes.patch; please take a look.

I also made hash spilling work for parallel grouping sets. The plan now
looks like this:

gpadmin=# explain select g100, g10, sum(g::numeric), count(*), max(g::text)
from gstest_p group by cube (g100,g10);
                                        QUERY PLAN
-------------------------------------------------------------------------------------------
 Finalize MixedAggregate  (cost=1000.00..7639.95 rows=1111 width=80)
   Filtered by: (GROUPINGSETID())
   Group Key: ()
   Hash Key: g100, g10
   Hash Key: g100
   Hash Key: g10
   Planned Partitions: 4
   ->  Gather  (cost=1000.00..6554.34 rows=7777 width=84)
         Workers Planned: 7
         ->  Partial MixedAggregate  (cost=0.00..4776.64 rows=1111 width=84)
               Group Key: ()
               Hash Key: g100, g10
               Hash Key: g100
               Hash Key: g10
               Planned Partitions: 4
               ->  Parallel Seq Scan on gstest_p  (cost=0.00..1367.71 rows=28571 width=12)
(16 rows)
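
For readers following along, here is a rough Python sketch of the shape of
that plan: each worker builds partial states for every grouping set of
cube (g100,g10), and the finalize step combines states that carry the same
grouping set id and key. It only illustrates the two-stage idea and is not
how the patch is implemented; the helper names, the four-way split and the
sample data (g100 = g % 100, g10 = g % 10) are assumptions made up for the
example.

```python
from itertools import combinations


def cube(cols):
    """Expand CUBE(cols) into its grouping sets, largest set first."""
    return [c for r in range(len(cols), -1, -1) for c in combinations(cols, r)]


def partial_agg(rows, sets):
    """Worker step: one partial state per (grouping set id, key).

    The state is (sum(g), count(*), max(g::text)).
    """
    states = {}
    for row in rows:
        text = str(row["g"])
        for set_id, cols in enumerate(sets):
            key = (set_id, tuple(row[c] for c in cols))
            s, n, m = states.get(key, (0, 0, None))
            states[key] = (s + row["g"], n + 1,
                           text if m is None or text > m else m)
    return states


def finalize_agg(partials):
    """Leader step: combine states sharing a grouping set id and key."""
    final = {}
    for states in partials:
        for key, (s, n, m) in states.items():
            if key in final:
                fs, fn, fm = final[key]
                final[key] = (fs + s, fn + n, max(fm, m))
            else:
                final[key] = (s, n, m)
    return final


# Hypothetical sample data; the real gstest_p table is not shown in this mail.
rows = [{"g": g, "g100": g % 100, "g10": g % 10} for g in range(1000)]
sets = cube(["g100", "g10"])
workers = [rows[i::4] for i in range(4)]  # pretend four workers split the scan
result = finalize_agg(partial_agg(chunk, sets) for chunk in workers)
```

Tagging every partial state with its grouping set id is what lets the
finalize step keep the four sets apart, which is roughly the role the
"Filtered by: (GROUPINGSETID())" line plays in the plan above.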

Thanks,
Pengzhou

On Thu, Mar 19, 2020 at 10:09 AM Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:

> Hi,
>
> unfortunately this got a bit broken by the disk-based hash aggregation,
> committed today, and so it needs a rebase. I've started looking at the
> patch before that, and I have it rebased on e00912e11a9e (i.e. the
> commit before the one that breaks it).
>
> Attached is the rebased patch series (now broken), with a couple of
> commits with some minor cosmetic changes I propose to make (easier than
> explaining it on a list, it's mostly about whitespace, comments etc).
> Feel free to reject the changes, it's up to you.
>
> I'll continue doing the review, but it'd be good to have a fully rebased
> version.
>
> regards
>
> --
> Tomas Vondra
> http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

Attachment                                          Content-Type              Size
0001-All-grouping-sets-do-their-own-sorting.patch   application/octet-stream  36.0 KB
0002-fixes.patch                                    application/octet-stream  6.8 KB
0003-fix-a-numtrans-bug.patch                       application/octet-stream  3.4 KB
0004-Reorganise-the-aggregate-phases.patch          application/octet-stream  89.0 KB
0005-Parallel-grouping-sets.patch                   application/octet-stream  63.7 KB
