Instability of pg_walsummary/002_blocks.pl due to timing

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Instability of pg_walsummary/002_blocks.pl due to timing
Date: 2025-07-06 09:00:00
Message-ID: f35ba3db-fca7-4693-bc35-6db64488e4b1@gmail.com
Lists: pgsql-hackers

Hello hackers,

A couple of failures of the 002_blocks test have occurred during the past
three months ([1], [2]), with the following diagnostics:
#   Failed test 'WAL summarizer generates statistics for WAL reads'
#   at /home/bf/bf-build/culicidae/REL_18_STABLE/pgsql/src/bin/pg_walsummary/t/002_blocks.pl line 54.
#          got: 'f'
#     expected: 't'
# Looks like you failed 1 test of 8.

pgsql.build/testrun/pg_walsummary/002_blocks/log/regress_log_002_blocks
[12:29:12.131](0.351s) ok 1 - WAL summarization caught up after insert
[12:29:12.196](0.065s) not ok 2 - WAL summarizer generates statistics for WAL reads
[12:29:12.198](0.002s) #   Failed test 'WAL summarizer generates statistics for WAL reads'
#   at /home/bf/bf-build/culicidae/REL_18_STABLE/pgsql/src/bin/pg_walsummary/t/002_blocks.pl line 54.
[12:29:12.198](0.000s) #          got: 'f'
#     expected: 't'
[12:29:12.267](0.069s) # after insert, summarized through 0/1821510
[12:29:12.507](0.240s) ok 3 - got new WAL summary after update

This test case is rather new; it was added by f4694e0f3 (2025-03-05).

I could reproduce this failure within 20 test runs with the following
modification:
--- a/src/backend/postmaster/walsummarizer.c
+++ b/src/backend/postmaster/walsummarizer.c
@@ -1544,6 +1544,7 @@ summarizer_read_local_xlog_page(XLogReaderState *state,
                                 * so we don't tight-loop.
                                 */
                                ProcessWalSummarizerInterrupts();
+pg_usleep(1000000);
                                summarizer_wait_for_wal();
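
That the added sleep makes the failure easy to hit is consistent with the
test sampling the statistics only once, possibly before the summarizer has
reported them (this is my assumption, not something verified in the code).
A minimal sketch of the difference between a one-shot check and a bounded
poll, using a simulated clock; SimState, check_once and poll_until are
hypothetical names for illustration, not PostgreSQL code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: the statistics become visible at a fixed time. */
typedef struct
{
	int			now_ms;			/* simulated clock */
	int			visible_at_ms;	/* when the stats can first be observed */
} SimState;

static bool
stats_visible(const SimState *s)
{
	return s->now_ms >= s->visible_at_ms;
}

/* One-shot check: fails if the stats are reported after this moment. */
static bool
check_once(const SimState *s)
{
	return stats_visible(s);
}

/* Bounded poll: re-check every interval_ms until timeout_ms has elapsed. */
static bool
poll_until(SimState *s, int interval_ms, int timeout_ms)
{
	int			deadline = s->now_ms + timeout_ms;

	assert(interval_ms > 0);	/* otherwise the loop could never advance */

	for (;;)
	{
		if (stats_visible(s))
			return true;
		if (s->now_ms >= deadline)
			return false;
		s->now_ms += interval_ms;	/* stands in for a real sleep */
	}
}
```

With the stats becoming visible 300 ms in, check_once() at time 0 fails,
while poll_until() with a 100 ms interval and a 1000 ms timeout succeeds.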

Michael, as you added the test case, could you please have a look?

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tamandua&dt=2025-04-09%2007%3A36%3A05
[2] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2025-07-01%2010%3A23%3A38

Best regards,
Alexander
