Re: Parallel Aggregates for string_agg and array_agg

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Aggregates for string_agg and array_agg
Date: 2024-03-26 14:00:00
Message-ID: d4493a28-589a-5328-fed5-250f2d7d3e2a@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello David,

23.01.2023 07:37, David Rowley wrote:
> I've now pushed this.
>

I've discovered that the test query:
-- Ensure parallel aggregation is actually being used.
explain (costs off) select * from v_pagg_test order by y;

added by 16fd03e95 fails sometimes. For instance:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=urutu&dt=2024-03-19%2021%3A04%3A05

--- /home/bf/bf-build/urutu/HEAD/pgsql/src/test/regress/expected/aggregates.out 2024-02-24 06:42:47.039073180 +0000
+++ /home/bf/bf-build/urutu/HEAD/pgsql.build/src/test/regress/results/aggregates.out 2024-03-19 22:24:18.155876135 +0000
@@ -1993,14 +1993,16 @@
          Sort Key: pagg_test.y, (((unnest(regexp_split_to_array((string_agg((pagg_test.x)::text, ','::text)),
','::text))))::integer)
          ->  Result
                ->  ProjectSet
-                     ->  Finalize HashAggregate
+                     ->  Finalize GroupAggregate
...

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=phycodurus&dt=2024-02-28%2007%3A38%3A27

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=avocet&dt=2024-02-08%2008%3A47%3A45

I suspect that these failures caused by autovacuum.

I could reproduce this plan change when running multiple tests in
parallel, and also with the attached test patch applied (several sleeps are
needed for autovacuum/relation_needs_vacanalyze() to get a non-zero
mod_since_analyze value from pgstat):
TEMP_CONFIG=/tmp/temp.config TESTS="test_setup create_index create_aggregate aggregates" make -s check-tests

where /tmp/temp.config is:
autovacuum_naptime = 1
log_autovacuum_min_duration = 0

log_min_messages = INFO
log_min_error_statement = log
log_statement = 'all'

With EXPLAIN (VERBOSE), I see a slight change of the Seq Scan cost/rows
estimate:
... ->  Parallel Seq Scan on public.pagg_test  (cost=0.00..48.99 rows=2599 width=8)
vs
.. ->  Parallel Seq Scan on public.pagg_test  (cost=0.00..48.00 rows=2500 width=8)
(after automatic analyze of table "regression.public.pagg_test")

Best regards,
Alexander

Attachment Content-Type Size
aggregates.patch text/x-patch 2.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kartyshov Ivan 2024-03-26 14:06:51 Re: [HACKERS] make async slave to wait for lsn to be replayed
Previous Message Robert Haas 2024-03-26 13:43:58 Re: Possibility to disable `ALTER SYSTEM`