Re: Parallel Aggregates for string_agg and array_agg

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Aggregates for string_agg and array_agg
Date: 2018-04-05 03:41:30
Message-ID: CAKJS1f_yDsOrSffHBqAN3Gx27R1vW6OHaKy_EwnhnLZJmo8+Mw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Tomas,

Thanks for taking another look.

On 5 April 2018 at 07:12, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> Seems fine to me, although we should handle the anyarray case too, I
> guess. That is, get_agg_clause_costs_walker() should do this too:
>
> /* Same thing for array_agg_array_(de)serialize. */
> if ((aggserialfn == F_ARRAY_AGG_ARRAY_SERIALIZE ||
> aggdeserialfn == F_ARRAY_AGG_ARRAY_DESERIALIZE) &&
> !agg_args_support_sendreceive(aggref))
> costs->hasNonSerial = true;

hmm, array_agg_array_serialize and array_agg_array_deserialize don't
use the send/receive functions though, so not sure why that's
required?

> Other than that, the patch seems fine to me, and it's already marked as
> RFC so I'll leave it at that.

Thanks.

> The last obstacle seems to be the argument about the risks of the patch
> breaking queries of people relying on the ordering. Not sure what's are
> the right next steps in this regard ...

yeah, seems like a bit of a stalemate.

Personally, think if the group of people Tom mentions do exist, then
they've already been through some troubled times since Parallel Query
was born. I'd hope they've already put up their defenses due to the
advent of that feature. I stand by my thoughts that it's pretty
bizarre to draw the line here when we've probably been causing these
people issues for many years already. I said my piece on this already
so likely not much point in going on about it. These people are also
perfectly capable of sidestepping the whole issue with setting
max_parallel_workers_per_gather TO 0.

Perhaps one solution is to drop the string_agg and keep the array_agg.
Unsure if that would be good enough for Tom? More people seem to have
voiced that array_agg ordering is generally less of a concern, which I
imagine is probably true, but my opinion might not matter here as I'm
the sort of person who if I needed a specific ordering I'd have
written an ORDER BY clause.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2018-04-05 04:04:22 Unstable number of workers in select_parallel test on spurfowl
Previous Message Amit Langote 2018-04-05 03:14:03 Re: [HACKERS] Runtime Partition Pruning