Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"

From: "Anton A(dot) Melnikov" <aamelnikov(at)inbox(dot)ru>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"
Date: 2023-02-14 15:52:20
Message-ID: 534f1b4c-ad0c-6ba0-ed9a-521f5e240e0c@inbox.ru
Lists: pgsql-hackers

Hi, Thomas!

On 14.02.2023 06:38, Anton A. Melnikov wrote:
> Also i did several experiments with fsync=on and found more appropriate behavior:
> The stress test with sizeof(ControlFileData) = 512+8 = 520 bytes failed in a 4,5 hours,
> but the other one with ordinary sizeof(ControlFileData) = 296 not crashed in more than 12 hours.

Nonetheless, the test with the ordinary sizeof(ControlFileData) = 296 also failed, after about 18 hours:

2023-02-13 18:07:21.476 MSK [7640] LOG: starting PostgreSQL 16devel, compiled by Visual C++ build 1929, 64-bit
2023-02-13 18:07:21.483 MSK [7640] LOG: listening on IPv6 address "::1", port 5432
2023-02-13 18:07:21.483 MSK [7640] LOG: listening on IPv4 address "127.0.0.1", port 5432
2023-02-13 18:07:21.556 MSK [1940] LOG: database system was shut down at 2023-02-13 18:07:12 MSK
2023-02-13 18:07:21.590 MSK [7640] LOG: database system is ready to accept connections
@@@@@@@@@@@@@ sizeof(ControlFileData) = 296
2023-02-13 18:12:21.545 MSK [9532] LOG: checkpoint starting: time
2023-02-13 18:12:21.583 MSK [9532] LOG: checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.003 s, sync=0.009 s, total=0.038 s; sync files=2, longest=0.005 s, average=0.005 s; distance=0 kB, estimate=0 kB; lsn=0/17AC388, redo lsn=0/17AC350
2023-02-14 12:12:21.738 MSK [8676] ERROR: calculated CRC checksum does not match value stored in file
2023-02-14 12:12:21.738 MSK [8676] CONTEXT: SQL statement "SELECT pg_control_system()"
PL/pgSQL function inline_code_block line 1 at PERFORM
2023-02-14 12:12:21.738 MSK [8676] STATEMENT: do $$ begin loop perform pg_control_system(); end loop; end; $$;

So my earlier conclusion, quoted below, is incorrect:

> Seems in that case the behavior corresponds to msdn. So if it is possible
> to use fsync() under windows when the GUC fsync is off it maybe a solution
> for this problem. If so there is no need to lock the pg_control file under windows at all.

and cannot be a solution.
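
To illustrate the failure mode, here is a minimal sketch of why a reader that races with a writer can see a CRC mismatch. It assumes the error comes from a torn read, i.e. the reader gets part of the new contents and part of the old ones; the log above does not prove that by itself. The record layout, the 128-byte split point and the helper names are hypothetical, and zlib's CRC-32 stands in for the checksum that pg_control really uses.

```python
import struct
import zlib

# Hypothetical layout: 296-byte record = 292 payload bytes + 4-byte CRC.
# This only mimics the size reported in the log, not the real ControlFileData.
PAYLOAD_SIZE = 292


def make_record(counter: int) -> bytes:
    """Build a record whose payload depends on counter, followed by its CRC."""
    payload = struct.pack("<Q", counter) * (PAYLOAD_SIZE // 8)
    payload += b"\0" * (PAYLOAD_SIZE % 8)
    # zlib.crc32 is a stand-in (assumption), not the checksum PostgreSQL uses.
    return payload + struct.pack("<I", zlib.crc32(payload))


def crc_ok(record: bytes) -> bool:
    """Recompute the CRC over the payload and compare with the stored value."""
    payload = record[:-4]
    (stored,) = struct.unpack("<I", record[-4:])
    return zlib.crc32(payload) == stored


old = make_record(1)   # contents before the update
new = make_record(2)   # contents after the update

# Simulated torn read: the first 128 bytes already come from the new version,
# the rest (including the stored CRC) still comes from the old one.
torn = new[:128] + old[128:]

print("old ok: ", crc_ok(old))
print("new ok: ", crc_ok(new))
print("torn ok:", crc_ok(torn))
```

Both complete versions verify, while the mixed one does not, which is the same symptom as "calculated CRC checksum does not match value stored in file".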

Sincerely yours,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
