Re: Adding CI to our tree

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>
Subject: Re: Adding CI to our tree
Date: 2022-01-19 04:54:12
Message-ID: 450972.1642568052@sss.pgh.pa.us
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2022-01-18 21:50:07 -0500, Tom Lane wrote:
>> This actually causes parallel check-world to fail altogether on florican's
>> host, because the initial fsync of the recovered primary takes more than 3
>> minutes when there's conflicting I/O traffic, causing pg_ctl to time out.

> Ugh.

I misspoke there: it's the standby that is performing an fsync'd
checkpoint and timing out, during the test's promote-the-standby
step.

This test attempt revealed another problem too: the standby never
shut down, and thus the calling "make" never quit, until I intervened
manually. I'm not sure why. I see that Cluster::promote uses
system_or_bail() to run "pg_ctl promote" ... could it be that
BAIL_OUT causes the normal script END hooks to not get run?
But it seems like we'd have noticed that long ago.

regards, tom lane
