Re: A test for replay of regression tests

From: Andres Freund <andres(at)anarazel(dot)de>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: A test for replay of regression tests
Date: 2022-01-27 23:03:57
Message-ID: 20220127230357.qmcuz265czinmbcm@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2022-01-27 17:51:52 -0500, Andrew Dunstan wrote:
> (Not actually fairywren, but equivalent) It's hung at
> src/test/recovery/t/009_twophase.pl line 84:
>
>
> $psql_rc = $cur_primary->psql('postgres', "COMMIT PREPARED 'xact_009_1'");

That is very likely the socket-shutdown bug that led to:

commit 64b2c6507e5714b5c688b9c5cc551fbedb7b3b58
Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Date: 2022-01-25 12:17:40 -0500

Revert "graceful shutdown" changes for Windows, in back branches only.

This reverts commits 6051857fc and ed52c3707, but only in the back
branches. Further testing has shown that while those changes do fix
some things, they also break others; in particular, it looks like
walreceivers fail to detect walsender-initiated connection close
reliably if the walsender shuts down this way. We'll keep trying to
improve matters in HEAD, but it now seems unwise to push these changes
into stable releases.

Discussion: https://postgr.es/m/CA+hUKG+OeoETZQ=Qw5Ub5h3tmwQhBmDA=nuNO3KG=zWfUypFAw@mail.gmail.com

If you apply that commit, does the problem go away?

That's why I'd suggested reverting them in
https://postgr.es/m/20220125023609.5ohu3nslxgoygihl%40alap3.anarazel.de

> This is an Amazon EC2 WS2019 instance, of type t3.large i.e. 8Gb of
> memory (not the same machine I reported test times from). Perhaps I need
> to test on another instance. Note though that when I tested with a
> ucrt64 build, including use of the ucrt64 perl/prove, the recovery test
> passed on an equivalent instance, so that's probably another reason to
> switch fairywren to using the ucrt64 environment.

Without the revert I do get through the tests some of the time. IMO it's
likely that the hang isn't related to the specific msys/mingw environment.

Greetings,

Andres Freund
