011_crash_recovery.pl intermittently fails

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: 011_crash_recovery.pl intermittently fails
Date: 2021-03-05 02:50:11
Message-ID: 20210305.115011.558061052471425531.horikyota.ntt@gmail.com
Lists: pgsql-hackers

Hello.

I noticed that 011_crash_recovery.pl fails intermittently (roughly one
run in three in my environment) at the second test.

> t/011_crash_recovery.pl .. 2/3
> # Failed test 'new xid after restart is greater'
> # at t/011_crash_recovery.pl line 56.
> # '539'
> # >
> # '539'
>
> # Failed test 'xid is aborted after crash'
> # at t/011_crash_recovery.pl line 60.
> # got: 'committed'
> # expected: 'aborted'
> # Looks like you failed 2 tests of 3.
> t/011_crash_recovery.pl .. Dubious, test returned 2 (wstat 512, 0x200)
> Failed 2/3 subtests
>
> Test Summary Report
> -------------------
> t/011_crash_recovery.pl (Wstat: 512 Tests: 3 Failed: 2)
> Failed tests: 2-3
> Non-zero exit status: 2
> Files=1, Tests=3, 3 wallclock secs ( 0.03 usr 0.01 sys + 1.90 cusr 0.39 csys = 2.33 CPU)
> Result: FAIL

If the server crashes before emitting WAL records for the transaction
that was just started, the restarted server cannot know that the xid
was ever assigned. I'm not sure whether that is what the test intends,
but we must make sure the WAL is emitted before crashing. CHECKPOINT
ensures that.
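
To make the failure mode concrete, here is a toy model in Python. It is
only a sketch under my own assumptions, not PostgreSQL's actual data
structures: ToyServer, its fields, and scenario() are hypothetical
names, and the model simply treats any WAL still in the buffer as lost
at crash time. It also assumes the transaction started after the
restart commits. Without a checkpoint the restarted server hands out
the same xid again, which reproduces both failures shown above.

```python
class ToyServer:
    """Toy model only: not PostgreSQL's real internals."""

    def __init__(self, next_xid=539):
        self.next_xid = next_xid          # in-memory counter, lost on crash
        self.wal_buffer = []              # WAL records not yet on disk
        self.durable_wal = []             # WAL records that survive a crash
        self.durable_next_xid = next_xid  # counter saved by the last checkpoint

    def assign_xid(self):
        # Start a transaction and buffer one WAL record carrying its xid.
        xid = self.next_xid
        self.next_xid += 1
        self.wal_buffer.append(("insert", xid))
        return xid

    def flush(self):
        # Move buffered WAL records to durable storage.
        self.durable_wal.extend(self.wal_buffer)
        self.wal_buffer.clear()

    def commit(self, xid):
        # A commit record is flushed immediately in this model.
        self.wal_buffer.append(("commit", xid))
        self.flush()

    def checkpoint(self):
        # Flush all WAL and remember the current xid counter.
        self.flush()
        self.durable_next_xid = self.next_xid

    def crash_and_restart(self):
        # Buffered WAL is lost; the counter is rebuilt from durable state.
        self.wal_buffer.clear()
        seen = [xid for _, xid in self.durable_wal]
        self.next_xid = max([self.durable_next_xid] + [x + 1 for x in seen])

    def status(self, xid):
        # Simplification: anything without a commit record counts as aborted.
        if ("commit", xid) in self.durable_wal:
            return "committed"
        return "aborted"


def scenario(checkpoint_before_crash):
    server = ToyServer()
    old_xid = server.assign_xid()
    if checkpoint_before_crash:
        server.checkpoint()
    server.crash_and_restart()
    new_xid = server.assign_xid()
    server.commit(new_xid)
    return old_xid, new_xid, server.status(old_xid)
```

In this model scenario(False) returns the same xid before and after the
restart, and the old xid then reports as committed, while
scenario(True) yields a strictly greater xid and an aborted status for
the old one.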

Thoughts? The attached patch seems to stabilize the test for me.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
stabilize_011_crash_recovery_pl.patch text/x-patch 401 bytes
