Re: race condition when writing pg_control

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: race condition when writing pg_control
Date: 2024-05-16 16:19:22
Message-ID: CAAKRu_YNGwEYrorQYza_W8tU+=toXRHG8HpyHC-KDbZqA_ZVSA@mail.gmail.com
Lists: pgsql-hackers

On Sun, Jun 7, 2020 at 10:49 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> On Wed, Jun 3, 2020 at 2:03 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> > On Wed, Jun 03, 2020 at 10:56:13AM +1200, Thomas Munro wrote:
> > > Sorry for my radio silence, I got tangled up with a couple of
> > > conferences. I'm planning to look at 0001 and 0002 shortly.
> >
> > Thanks!
>
> I pushed 0001 and 0002, squashed into one commit. I'm not sure about
> 0003. If we're going to do that, wouldn't it be better to just
> acquire the lock in that one extra place in StartupXLOG(), rather than
> introducing the extra parameter?

Today, after committing a3e6c6f, I saw recovery/018_wal_optimize.pl
fail [1] and found this message in the replica log [2]:

2024-05-16 15:12:22.821 GMT [5440][not initialized] FATAL: incorrect
checksum in control file

I'm pretty sure it's not related to my commit. So, I was looking for
existing reports of this error message.

It's a long shot, since 0001 and 0002 were already pushed, but this
thread is the only recent report I could find of "FATAL: incorrect
checksum in control file" in the pgsql-hackers or pgsql-bugs archives.

I do see this thread from 2016 [3] which might be relevant because the
reported bug was also on Windows.

- Melanie

[1] https://cirrus-ci.com/task/4626725689098240
[2] https://api.cirrus-ci.com/v1/artifact/task/4626725689098240/testrun/build/testrun/recovery/018_wal_optimize/log/018_wal_optimize_node_replica.log
[3] https://www.postgresql.org/message-id/flat/CAEepm%3D0hh_Dvd2Q%2BfcjYpkVzSoNX2%2Bf167cYu5nwu%3Dqh5HZhJw%40mail.gmail.com#042e9ec55c782370ab49c3a4ef254f4a
