test_autovacuum/001_parallel_autovacuum is broken

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Daniil Davydov <3danissimo(at)gmail(dot)com>
Subject: test_autovacuum/001_parallel_autovacuum is broken
Date: 2026-04-07 02:23:57
Message-ID: CAA5RZ0s+kZZRMSF4HW7tZ9W2jS1o4B+Fg8dr5a-T6mANX+mdQA@mail.gmail.com
Lists: pgsql-hackers

Hi,

I noticed that the test added with parallel autovacuum in 1ff3180ca01 is
very slow, although it eventually succeeds. I tracked the slowness down to the
point in the test that waits for "parallel autovacuum worker updated cost params".

The portion of the test that waits for the cost params to propagate to the
workers gets stuck in wait_for_autovacuum_complete(). While it is stuck, the
injection point from the previous test, autovacuum-start-parallel-vacuum,
is still active, with an autovacuum worker on a template1 table waiting on it:

  datname  |                           query                           |            wait_event
-----------+-----------------------------------------------------------+----------------------------------
 postgres  | select datname, query, wait_event from pg_stat_activity ; |
 template1 | autovacuum: VACUUM ANALYZE pg_catalog.pg_attribute        | autovacuum-start-parallel-vacuum
           |                                                           | AutovacuumMain
           |                                                           | LogicalLauncherMain
           |                                                           | IoWorkerMain
           |                                                           | IoWorkerMain
           |                                                           | IoWorkerMain
           |                                                           | CheckpointerMain
           |                                                           | BgwriterMain
           |                                                           | WalWriterMain
(10 rows)

The poll_query_until eventually just times out, but this does not
cause the test to fail.

# test succeeded
----------------------------------- stderr -----------------------------------
# poll_query_until timed out executing this query:
#
# SELECT autovacuum_count > 1 FROM pg_stat_user_tables WHERE relname = 'test_autovac'
#
# expecting this output:
# t
# last actual query output:
# f
# with stderr:
================
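
To illustrate why the timeout above does not fail the test: poll_query_until
reports a timeout by returning false rather than dying, so a caller that
ignores the return value simply carries on. Below is a minimal, self-contained
sketch of that behavior. Note that poll_until() is a hypothetical stand-in,
not the real PostgreSQL::Test::Cluster method, and that the helper in the test
ignoring the result is my assumption based on the output above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for poll_query_until: calls $check up to
# $max_attempts times and returns 1 on success, 0 on timeout.
sub poll_until
{
	my ($check, $max_attempts) = @_;
	for my $attempt (1 .. $max_attempts)
	{
		return 1 if $check->();
	}
	return 0;
}

# A condition that never becomes true, like autovacuum_count > 1 while
# the worker is parked on an injection point.
my $never_true = sub { return 0; };

# Return value ignored: the timeout goes unnoticed.
poll_until($never_true, 3);
print "ignored result: script continues\n";

# Return value checked: the timeout becomes a hard failure.
my $ok = eval {
	poll_until($never_true, 3) or die "timed out waiting for condition\n";
	1;
};
print "checked result: ", ($ok ? "passed" : "failed: $@");
```

Checking the return value would turn a silent slowdown like this one into a
visible failure.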

This issue only occurs when I run all tests, not when I run
test_autovacuum in isolation. That makes sense, since during a full
run autovacuum also processes template1 and other tables unrelated
to the test.

I run all the tests (the equivalent of check-world) with:
```
meson test -q --print-errorlogs
```

I think we can remove the second wait_for_autovacuum_complete()
call in the test, since all we really need is wait_for_log to guarantee
that the cost parameters were updated. There is no need to wait for the
autovacuum to complete.

--- a/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
+++ b/src/test/modules/test_autovacuum/t/001_parallel_autovacuum.pl
@@ -177,8 +177,6 @@ $node->wait_for_log(
 qr/parallel autovacuum worker updated cost params: cost_limit=500, cost_delay=5, cost_page_miss=10, cost_page_dirty=10, cost_page_hit=10/,
 $log_offset);
 
-wait_for_autovacuum_complete($node, $av_count);
-
 # Cleanup
 $node->safe_psql(
 'postgres', qq{

Regards,

--
Sami Imseih
Amazon Web Services (AWS)
