Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"

From: "Anton A(dot) Melnikov" <aamelnikov(at)inbox(dot)ru>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"
Date: 2023-03-08 03:43:47
Message-ID: 7bad2d75-56c6-17e5-642a-c52f0c8eb82b@inbox.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, Thomas!

On 04.03.2023 00:39, Thomas Munro wrote:
>> It seems a good topic for a separate thread patch. Would you provide a
>> link to the thread you mentioned please?
>
> https://www.postgresql.org/message-id/flat/367d01a7-90bb-9b70-4cda-248e81cc475c%40cosium.com

Thanks! The important words there:
>> But I fail to see how full_page_writes prevents this since it only act on writes

> It ensures the damage is later repaired during WAL replay. Which can only
> happen if the WAL contains the necessary information to do so - the full page
> writes.

Together with the docs about restoring corrupted pg_control in the
pg_resetwal utility description these words made me think about whether
to save the contents of pg_control at the beginning and end of
checkpoint into WAL and teach pg_resetwal to read them? It would be like
a periodic backup of the pg_control to a safe place.
This thought has nothing to do with this patch and this thread, and, in general,
i'm not sure if it has any practical meaning, and whether, on the contrary, it
may lead to some difficulties. If it seems that there is a sense, then it
will be possible to think further about it.

> For example, see subscription/011_generated which failed like this:
>
> 2023-02-16 06:57:25.724 GMT [5736][not initialized] PANIC: could not
> read file "global/pg_control": Permission denied
>
> That was fixed after I changed it to also do locking in xlog.c
> ReadControlFile(), in the version you tested. There must be something
> I don't understand going on, because that cluster wasn't even running
> before: it had just been created by initdb.
>
> But, anyway, I have a new idea that makes that whole problem go away
> (though I'd still like to understand what happened there):

Seems to be it's a race between the first reading of the pg_control in PostmasterMain()
in LocalProcessControlFile(false) and the second one in SubPostmasterMain() here:
/*
* (re-)read control file, as it contains config. The postmaster will
* already have read this, but this process doesn't know about that.
*/
LocalProcessControlFile(false);

which crashes according to the crash log: crashlog-postgres.exe_19a0_2023-02-16_06-57-26-675.txt

> With the "pseudo-advisory lock for Windows" idea from the last email,
> we don't need to bother with file system level locking in many places.
> Just in controldata_utils.c, for FE/BE interlocking (that is, we don't
> need to use that for interlocking of concurrent reads and writes that
> are entirely in the backend, because we already have an LWLock that we
> could use more consistently). Changes:
>
> 1. xlog.c mostly uses no locking
> 2. basebackup.c now acquires ControlFileLock
> 3. only controldata_utils.c uses the new file locking, for FE-vs-BE interlocking
> 4. lock past the end (pseudo-advisory locking for Windows)

Although the changes in 1. contributes to the problem described above again,
but 4. fixes this. And i did not find any other places where ReadControlFile()
can be called in different processes.That's all ok.
Thanks to 4., now it is not necessary to use VSS to copy the pg_control file,
it can be copied in a common way even during the torture test. This is very good.
I really like the idea with LWLock where possible.
In general, i think that these changes make the patch more lightweight and fast.

Also i ran tests for more than a day in stress mode with fsync=off under windows
and Linux and found no problems. Patch-tester also passes without errors.

I would like to move this patch to RFC, since I don’t see any problems
both in the code and in the tests, but the pg_backup_start() subproblem confuses me.
Maybe move it to a separate patch in a distinct thread?
As there are a number of suggestions and questions to discuss such as:

> Note that when we recover from a basebackup or pg_backup_start()
> backup, we use the backup label to find a redo start location in the
> WAL (see read_backup_label()), BUT we still read the copied pg_control
> file (one that might be too "new", so we don't use its redo pointer).
> So it had better not be torn, or the recovery will fail. So, in this
> version I protected that sendFile() with ControlFileLock. But...
>
> Isn't that a bit strange? To go down this path we would also need to
> document the need to copy the control file with the file locked to
> avoid a rare failure, in the pg_backup_start() documentation. That's
> annoying (I don't even know how to do it with easy shell commands;
> maybe we should provide a program that locks and cats the file...?).

Variant with separate utility looks good, with the recommendation
in the doc to use it for the pg_control coping.

Also seems it is possible to make a function available in psql, such as
export_pg_control('dst_path') with the destination path as argument
and call it before pg_backup_stop().
Or pass the pg_control destination path to the pg_backup_stop() as extra argument.
Or save pg_control to a predetermined location during pg_backup_stop() and specify
in the docs that one need to copy it from there. I suppose that we have the right
to ask the user to perform some manual manipulations here like with backup_label.

> Could we make better use of the safe copy that we have in the log?
> Then the pg_backup_start() subproblem would disappear. Conceptually,
> that'd be just like the way we use FPI for data pages copied during a
> backup. I'm not sure about any of that, though, it's just an idea,
> not tested.

Sorry, i didn't understand the question about log. Would you explain me
please what kind of log did you mention and where can i look this
safe copy creation in the code?

With the best wishes,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2023-03-08 03:46:38 Re: [Proposal] Add foreign-server health checks infrastructure
Previous Message Peter Smith 2023-03-08 03:39:26 Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher