stats test intermittent failure

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: stats test intermittent failure
Date: 2023-07-10 18:35:11
Message-ID: CAAKRu_bNG27AxG9TdPtwsL6wg8AWbVckjmTL2t1HF=miDQuNtw@mail.gmail.com
Lists: pgsql-hackers

Hi,

Jeff pointed out that one of the pg_stat_io tests has failed a few times
over the past months (here on morepork [1] and more recently here on
francolin [2]).

Failing test diff for those who prefer not to scroll:

+++ /home/bf/bf-build/francolin/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/results/stats.out 2023-07-07 18:48:25.976313231 +0000
@@ -1415,7 +1415,7 @@
   :io_sum_vac_strategy_after_reuses > :io_sum_vac_strategy_before_reuses;
  ?column? | ?column?
 ----------+----------
- t        | t
+ t        | f

My theory about the test failure is that, when there is enough demand
for shared buffers, the flaky test fails because it expects buffer
access strategy *reuses*, but concurrent queries have already flushed
those buffers before they could be reused. Attached is a patch which I
think will fix the test while keeping some code coverage. If we count
evictions and reuses together, their sum should still have increased.
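
As a rough illustration of that idea (a sketch only, not the contents of
the attached patch), the check could sum both counters from pg_stat_io
for the vacuum context. The io_sum_vac_strategy_*_reuses variable names
come from the failing diff above; the *_evictions names and the
placement of the VACUUM step are assumptions made for this example.

```sql
-- Make pending stats visible, then snapshot the counters before the workload.
SELECT pg_stat_force_next_flush();
SELECT sum(reuses) AS reuses, sum(evictions) AS evictions
  FROM pg_stat_io WHERE context = 'vacuum' \gset io_sum_vac_strategy_before_

-- ... run the VACUUM under test here (assumed step, omitted in this sketch) ...

-- Snapshot the same counters afterwards.
SELECT pg_stat_force_next_flush();
SELECT sum(reuses) AS reuses, sum(evictions) AS evictions
  FROM pg_stat_io WHERE context = 'vacuum' \gset io_sum_vac_strategy_after_

-- A buffer taken over by another backend is no longer reused by the
-- strategy; vacuum evicts a different buffer instead. Comparing the sum
-- tolerates either outcome.
SELECT (:io_sum_vac_strategy_after_reuses + :io_sum_vac_strategy_after_evictions) >
       (:io_sum_vac_strategy_before_reuses + :io_sum_vac_strategy_before_evictions);
```

Comparing the sum trades some precision for stability: the test no
longer proves that a reuse specifically happened, only that the strategy
obtained buffers by one of the two paths.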

- Melanie

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=morepork&dt=2023-06-16%2018%3A30%3A32
[2] https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=francolin&dt=2023-07-07%2018%3A43%3A57&stg=recovery-check

Attachment Content-Type Size
v1-0001-Fix-pg_stat_io-buffer-reuse-test-instability.patch text/x-patch 6.5 KB
