Re: Improve error reporting in 027_stream_regress test

From: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Improve error reporting in 027_stream_regress test
Date: 2025-07-18 08:57:07
Message-ID: CAN55FZ31NqFqqE7yYOVSjvvG57Gio4VFUNh3mj3G3MiTLuYqYA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Thu, 17 Jul 2025 at 03:08, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Wed, Jul 16, 2025 at 02:32:53PM +0300, Nazir Bilal Yavuz wrote:
> > On Wed, 16 Jul 2025 at 04:39, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> >> Hmm. Is that actually useful as we know that the standby has been
> >> stalen down when running the test? Even if we report something, we
> >> could always trim the output, like the standby case.
>
> My fingers and brain have flipped here. I meant likely
> s/stalen/taken/.
>
> > I guess you are right. Also, I tried killing standby while the
> > 027_stream_regress test was running and the test did not fail; instead
> > it continued until timeout. If that is always the case, then it makes
> > sense to report if and only if both primary and standby are alive.
>
> Okay. Understood. I'm pretty sure that somebody around you has also
> run the test and crashed the standby on replay, to note that the
> pattern on HEAD was bad.

Updated, now error reporting happens only if regression tests are
failed and both primary and standby are alive.

> > system_log() logs which command is run. I actually do not want to log
> > the command because it needs to be run after the regression tests and
> > before the 'regression tests pass'. If I log the command it looks like
> > this:
> >
> > # All 228 tests passed.
> > # Running: pg_isready -h /tmp/EXCcSldGZE -p 10493 >/dev/null 2>&1
> > # Running: pg_isready -h /tmp/EXCcSldGZE -p 10494 >/dev/null 2>&1
> > (21.939s) ok 2 - regression tests pass
> > (0.000s) ok 3 - primary is alive after the regression tests
> > (0.000s) ok 4 - standby is alive after the regression tests
> >
> > I think it looks like these pg_isready commands run randomly.
>
> FWIW, I'm usually in favor of a bit more logging, to understand what
> happens in the test script under-the-hood. That's less guessing when
> calling these routines. If I'm outvoted, that's fine.

I don't have a strong opinion on this. My point was 'pg_isready' being
on top of 'regression test pass' and not being on top of
'primary|standby is alive after the regression tests' gives a false
impression to me. However, if you say this is okay; then I can update
it.

> >> We could use that as a routine of its own in Utils.pm? It's not
> >> really something only for diff files, more something that we can use
> >> to trim the contents of a file, which could be the tail of a server
> >> log file as well.. It's always difficult to measure what a "good"
> >> number of lines is, but perhaps we could use an environment variable
> >> to make that configurable rather than enforce a policy of 50 lines
> >> because we think it's good enough?
> >
> > I think this is a good idea. How about a function called file_trim()
> > with three arguments: the file name, a mode ('head' or 'tail'), and
> > the number of lines to trim from that end. This approach may require
> > reading the file more than once, but it is easy to use.
>
> Sounds fine to me. Thanks!

I added that as 0001. I used a shifting method for the 'tail'
direction to not use too much memory. I found that there is
'File::ReadBackwards' in Perl but you need to install it, so I didn't
use it.

Now patch is divided into three:

0001: 'Add trim_file() helper to Utils.pm' -> Which effectively does
nothing, just adds a function to be used for a subsequent patch. This
function accepts 'line_count' as an argument but
'PG_TEST_FILE_TRIM_LINES' environment variable overrides it. Should I
document 'PG_TEST_FILE_TRIM_LINES' somewhere?

0002: 'Improve error reporting in 027_stream_regress test' -> Uses
trim_file() function to improve error reporting by including the head
and tail of regression.diffs directly in the failure message.

0003: 'Check primary and standby are alive after regression tests in
027_stream_regress' -> Add test for checking status of primary and
standby after the regression tests in 027_stream_regress. Also, it
makes error reporting happen only if regression tests are failed and
both primary and standby are alive.

--
Regards,
Nazir Bilal Yavuz
Microsoft

Attachment Content-Type Size
v3-0001-Add-trim_file-helper-to-Utils.pm.patch text/x-patch 2.2 KB
v3-0002-Improve-error-reporting-in-027_stream_regress-tes.patch text/x-patch 2.6 KB
v3-0003-Check-primary-and-standby-are-alive-after-regress.patch text/x-patch 2.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message shveta malik 2025-07-18 08:57:23 Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Previous Message Dilip Kumar 2025-07-18 08:40:58 Re: Logical Replication of sequences