Re: Parallel Aggregates for string_agg and array_agg

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Aggregates for string_agg and array_agg
Date: 2018-03-05 03:51:20
Message-ID: CAKJS1f9xEZdTaBbBEKY+TCKEoSZYaH1oSAtsEJHSYDJCCNnQPQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 5 March 2018 at 04:54, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> 1) There seems to be forgotten declaration of initArrayResultInternal in
> arrayfuncs.c. I suppose you've renamed it to initArrayResultWithSize and
> moved it to a header file, and forgotten to remove this bit.

Oops. Must be left over from trying things another way. Removed.

> 2) bytea_string_agg_deserialize has this comment:
>
> /*
> * data: technically we could reuse the buf.data buffer, but that is
> * slightly larger than we need due to the extra 4 bytes for the cursor
> */
>
> I find the argument "it has 4 extra bytes, so we'll allocate new chunk"
> somewhat weak. We do allocate the memory in 2^n chunks anyway, so the
> new chunk is likely to be much larger anyway. I'm not saying we need to
> reuse the buffer, IMHO the savings will be non-measurable.

Agreed. I've removed that part of the comment.

> 3) string_agg_transfn and bytea_string_agg_transfn say this"
>
> /*
> * We always append the delimiter text before the value text, even
> * with the first aggregated item. The reason for this is that if
> * this state needs to be combined with another state using the
> * aggregate's combine function, then we need to have the delimiter
> * for the first aggregated item. The first delimiter will be
> * stripped off in the final function anyway. We use a little cheat
> * here and store the position of the actual first value (after the
> * delimiter) using the StringInfo's cursor variable. This relies on
> * the cursor not being used for any other purpose.
> */
>
> How does this make the code any simpler, compared to not adding the
> delimiter (as before) and adding it when combining the values (at which
> point you should know when it's actually needed)?

The problem is that if we don't record the first delimiter then we
won't know what it is when it comes to combining the states.

Consider the following slightly backward-looking case;

select string_agg(',', x::text) from generate_Series(1,10)x;
string_agg
----------------------
,2,3,4,5,6,7,8,9,10,

Here the delimiter is the number, not the ','. Of course, the
delimiter for the first aggregated result is not shown. The previous
implementation of the transfn for this aggregate just threw away the
first delimiter, but that's no good for parallel aggregates as the
transfn may be getting called in a parallel worker, in which case
we'll need that delimiter in the combine function to properly join to
the other partial aggregated results matching the same group.

Probably I could improve the comment a bit. I understand that having a
variable delimiter is not commonplace in the normal world. I certainly
had never considered it before working on this. I scratched my head a
bit when doing this and thought about inventing a new trans type, but
I decided that the most efficient design for an aggregate state was to
store the delimiter and data all in one buffer and have a pointer to
the start of the actual data... StringInfo has exactly what's required
if you use the cursor as the pointer, so I didn't go and reinvent the
wheel.

I've now changed the comment to read:

/*
* It's important that we store the delimiter text for all aggregated
* items, even the first one, which at first thought you might think
* could just be discarded. The reason for this is that if this
* function is being called from a parallel worker, then we'll need
* the first delimiter in order to properly combine the partially
* aggregated state with the states coming from other workers. In the
* final output, the first delimiter will be stripped off of the final
* aggregate state, but in order to know where the actual first data
* is we must store the position of the first data value somewhere.
* Conveniently, StringInfo has a cursor property which can be used
* to serve our needs here.
*/

I've also updated an incorrect Assert in array_agg_array_serialize.

Please find attached the updated patch.

Many thanks for reviewing this.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
combinefn_for_string_and_array_aggs_v5.patch application/octet-stream 32.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2018-03-05 04:21:51 Re: non-bulk inserts and tuple routing
Previous Message Justin Pryzby 2018-03-05 03:40:26 Re: Typo in src/backend/access/hash/README