Re: Improve error reporting in 027_stream_regress test

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Improve error reporting in 027_stream_regress test
Date: 2025-07-29 04:00:01
Message-ID: 29b637df-f818-4b52-986a-f11ba28300e9@gmail.com
Lists: pgsql-hackers

Hello Nazir and Michael!

01.07.2025 10:57, Nazir Bilal Yavuz wrote:
> I agree with you. So, the current logic is:
>
> If primary is not alive: Do not report anything.
> If only primary is alive: Report the entire diff file.
> If both primary and standby are alive: Report entire diff file and add
> head+tail of diff to the failure message.
>
> Done like above in v2.

The new check has failed on mamba [1], apparently because this animal is
too slow for pg_isready:

regress_log_027_stream_regress:
...
# Running: pg_isready --host /home/buildfarm/bf-data/tmp/ep8AOH4m7l --port 24781
/home/buildfarm/bf-data/tmp/ep8AOH4m7l:24781 - no response
[08:01:50.899](1505.313s) ok 2 - regression tests pass
[08:01:50.902](0.003s) ok 3 - primary alive after regression test run
[08:01:50.905](0.003s) not ok 4 - standby alive after regression test run
[08:01:50.908](0.003s)
[08:01:50.908](0.000s) #   Failed test 'standby alive after regression test run'
#   at t/027_stream_regress.pl line 104.
[08:01:50.909](0.001s) #          got: '0'
#     expected: '1'

027_stream_regress_standby_1.log:
2025-07-28 07:37:27.237 EDT [22920:1] LOG:  starting PostgreSQL 19devel on powerpc-unknown-netbsd10.1, compiled by cc (nb3 20231008) 10.5.0, 32-bit
2025-07-28 07:37:27.239 EDT [22920:2] LOG:  listening on Unix socket "/home/buildfarm/bf-data/tmp/ep8AOH4m7l/.s.PGSQL.24781"
2025-07-28 07:37:27.281 EDT [25395:1] LOG:  database system was interrupted; last known up at 2025-07-28 07:36:48 EDT
2025-07-28 07:37:27.282 EDT [25395:2] LOG:  starting backup recovery with redo LSN 0/02000028, checkpoint LSN 0/02000080, on timeline ID 1
2025-07-28 07:37:27.283 EDT [25395:3] LOG:  entering standby mode
2025-07-28 07:37:27.287 EDT [25395:4] LOG:  redo starts at 0/02000028
...
2025-07-28 08:01:47.884 EDT [6985:1] [unknown] LOG:  connection received: host=[local]
2025-07-28 08:01:48.261 EDT [6985:2] [unknown] LOG:  connection authenticated: user="buildfarm" method=trust (/home/buildfarm/bf-data/HEAD/pgsql.build/src/test/recovery/tmp_check/t_027_stream_regress_standby_1_data/pgdata/pg_hba.conf:117)
2025-07-28 08:01:48.261 EDT [6985:3] [unknown] LOG:  connection authorized: user=buildfarm database=postgres application_name=027_stream_regress.pl
2025-07-28 08:01:51.552 EDT [6985:4] 027_stream_regress.pl LOG: could not send data to client: Broken pipe

### 3 seconds is pg_isready's default timeout

2025-07-28 08:01:51.552 EDT [6985:5] 027_stream_regress.pl FATAL: connection to client lost
2025-07-28 08:01:51.552 EDT [6985:6] 027_stream_regress.pl LOG: disconnection: session time: 0:00:03.670 user=buildfarm database=postgres host=[local]
...

What do you think of increasing the timeout (e.g., to 10 seconds)?
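
For illustration, here is a minimal shell sketch of what a longer timeout
could look like on the pg_isready command line. It relies only on the
documented --timeout (-t) option; the helper name pg_isready_cmd and the
10-second default are hypothetical, and this is not the actual change to
027_stream_regress.pl, just a way to show the resulting invocation:

```shell
#!/bin/sh
# Sketch only: build a pg_isready command line with an explicit timeout.
# --timeout (-t) is a documented pg_isready option; its default is 3 seconds.
# The function name and the 10-second default are illustrative assumptions.
pg_isready_cmd() {
    host=$1
    port=$2
    timeout=${3:-10}    # fall back to 10 seconds when no third argument is given
    printf 'pg_isready --timeout %s --host %s --port %s\n' \
        "$timeout" "$host" "$port"
}

# Print the command for the standby socket directory and port from the log above.
pg_isready_cmd /home/buildfarm/bf-data/tmp/ep8AOH4m7l 24781
```

The sketch only prints the command, so it can be inspected without a running
server; a third argument overrides the timeout if 10 seconds turns out to be
too short for a slow animal.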

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2025-07-28%2007%3A46%3A26

Best regards,
Alexander
