odd buildfarm failure - "pg_ctl: control file appears to be corrupt"

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"
Date: 2022-11-23 01:42:24
Message-ID: 20221123014224.xisi44byq3cf5psi@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

My buildfarm animal grassquit just showed an odd failure [1] in REL_11_STABLE:

ok 10 - standby is in recovery
# Running: pg_ctl -D /mnt/resource/bf/build/grassquit/REL_11_STABLE/pgsql.build/src/bin/pg_ctl/tmp_check/t_003_promote_standby2_data/pgdata promote
waiting for server to promote....pg_ctl: control file appears to be corrupt
not ok 11 - pg_ctl promote of standby runs

# Failed test 'pg_ctl promote of standby runs'
# at /mnt/resource/bf/build/grassquit/REL_11_STABLE/pgsql.build/../pgsql/src/test/perl/TestLib.pm line 474.

I didn't find other references to this kind of failure. Nor has the error
re-occurred on grassquit.

I don't immediately see a way to hit this message that doesn't indicate a bug
somewhere. We should be both updating and reading the control file
atomically.
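
To illustrate why atomicity matters here, the following is a minimal sketch, not PostgreSQL code. It assumes a made-up record layout (two counters plus a CRC-32, not the real pg_control format) and shows how a reader that overlaps with an in-place overwrite could see a mix of old and new bytes and fail the checksum. The names encode, is_valid, torn_read and read_with_retry are hypothetical, and the retry loop is only one possible mitigation, not something the existing code is known to do.

```python
import struct
import zlib

# Hypothetical record layout: two 8-byte counters followed by a CRC-32 of them.
# This is NOT the real pg_control layout; it only mimics "payload + checksum".
PAYLOAD = struct.Struct("<QQ")
CRC = struct.Struct("<I")


def encode(state: int, checkpoint: int) -> bytes:
    """Serialize the payload and append its CRC-32."""
    payload = PAYLOAD.pack(state, checkpoint)
    return payload + CRC.pack(zlib.crc32(payload))


def is_valid(record: bytes) -> bool:
    """Return True if the stored CRC matches the payload."""
    payload, stored = record[:PAYLOAD.size], record[PAYLOAD.size:]
    return CRC.unpack(stored)[0] == zlib.crc32(payload)


def torn_read(old: bytes, new: bytes, split: int) -> bytes:
    """Simulate a reader that sees the first `split` bytes of the new record
    and the remainder of the old one (a non-atomic overwrite in progress)."""
    return new[:split] + old[split:]


def read_with_retry(read_once, attempts: int = 3) -> bytes:
    """Re-read until the checksum verifies; raise if it never does."""
    for _ in range(attempts):
        record = read_once()
        if is_valid(record):
            return record
    raise ValueError("control file appears to be corrupt")


old = encode(state=1, checkpoint=100)
new = encode(state=2, checkpoint=200)

# New state field combined with the old checkpoint and old CRC.
torn = torn_read(old, new, split=8)
```

In this sketch both old and new verify on their own, while torn does not. A reader that simply re-reads after a mismatch would get past a transient overlap, although that would mask rather than explain a real atomicity bug.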

The failure has to be happening in wait_for_postmaster_promote(), because
standby2 was in fact promoted successfully.

Greetings,

Andres Freund

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grassquit&dt=2022-11-22%2016%3A33%3A57
