Re: Concurrency issue in pg_rewind

From: Alexey Kondratov <a(dot)kondratov(at)postgrespro(dot)ru>
To: Alexander Kukushkin <cyberdemn(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Concurrency issue in pg_rewind
Date: 2020-09-17 12:04:12
Message-ID: 3fc84939f5023a57e9b9b128a84c85d1@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2020-09-16 15:55, Alexander Kukushkin wrote:
> Hello,
>
> Today I bumped into an issue with pg_rewind which is not 100% clear
> where should be better fixed.
>
> The first call of pg_rewind failed with the following message:
> servers diverged at WAL location A76/39E55338 on timeline 132
> could not open file
> "/home/postgres/pgdata/pgroot/data/pg_wal/0000008400000A760000001E":
> No such file or directory
>
> could not find previous WAL record at A76/1EFFE620
> Failure, exiting
>
> In order to avoid rebuilding the replica from scratch, we restored the
> missing file by calling restore_command (wal-g) and repeated the call
> of pg_rewind.
>
> The second time pg_rewind also failed, but the error looked
> differently:
> servers diverged at WAL location A76/39E55338 on timeline 132
> rewinding from last common checkpoint at A76/1EF254B8 on timeline 132
>
> could not remove file
> "/home/postgres/pgdata/pgroot/data/pg_wal/.wal-g/prefetch/running/0000008400000A7600000024":
> No such file or directory
> Failure, exiting
>
> The second call left PGDATA in an inconsistent state (empty
> pg_control).
>
> A few words about where the pg_wal/.wal-g/prefetch/running/ is coming
> from: wal-g by default when fetching the WAL file is also trying to do
> a prefetch of a few next WAL files. For that it forks and the child
> process doing prefetch while the parent process exits. In order to
> avoid multiple parallel prefetches of the same file, wal-g keeps its
> state in the pg_wal/.wal-g directory. It also keeps prefetched files
> there.
>

Hm, I cannot understand why wal-g (or any other tool) is trying to run
pg_rewind, while WAL copying (and prefetch) is still in progress? Why do
not just wait until it is finished?

It is also not clear for me why it does not put prefetched WAL files
directly into the pg_wal?

>
> The issue occurred on 10.14, but I believe very similar might happen
> with postgres 13 when pg_rewind is called with --restore-target-wal
> option.
>

With --restore-target-wal pg_rewind is trying to call restore_command on
its own and it can happen at two stages:

1) When pg_rewind is trying to find the last checkpoint preceding a
divergence point. In that case file map is not even yet initialized.
Thus, all fetched WAL segments at this stage will be present in the file
map created later.

2) When it creates a data pages map. It should traverse WAL from the
last common checkpoint till the final shutdown point in order to find
all modified pages on the target. At this stage pg_rewind only updates
info about data segments in the file map. That way, I see a minor
problem that WAL segments fetched at this stage would not be deleted,
since they are absent in the file map.

Anyway, pg_rewind does not delete neither WAL segments, not any other
files in the middle of the file map creation, so I cannot imagine, how
it can get into the same trouble on its own.

>
> That made me think about how it could be improved in the pg_rewind.
> The thing is, that we want to have a specific file to be removed, and
> it is already not there. Should it be a fatal error?
> traverse_datadir()/recurse_dir() already ignoring all failed lstat()
> calls with errno == ENOENT.
>

recurse_dir() is also called on the source, which should be running when
--source-server is used, so it can remove files. There is actually a
TODO there:

/*
* File doesn't exist anymore. This is ok, if the new master
* is running and the file was just removed. If it was a data
* file, there should be a WAL record of the removal. If it
* was something else, it couldn't have been anyway.
*
* TODO: But complain if we're processing the target dir!
*/

>
> Basically I have to options:
> 1. Remove modify remove_target_file(const char *path, bool missing_ok)
> function and remove the missing_ok option, that would be consistent
> with recurse_dir()
> 2. Change the logic of remove_target_file(), so it doesn't exit with
> the fatal error if the file is missing, but shows only a warning.
>
> In addition to the aforementioned options the remove_target_dir() also
> should be improved, i.e. it should check errno and behave similarly to
> the remove_target_file() if the errno == ENOENT
>
> What do you think?
>

Although keeping arbitrary files inside PGDATA does not look like a good
idea for me, I do not see anything criminal in skipping non-existing
file, when executing a file map by pg_rewind.

Regards
--
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2020-09-17 12:11:40 Re: Yet another fast GiST build
Previous Message Dmitry Dolgov 2020-09-17 11:01:03 Re: [HACKERS] [PATCH] Generic type subscripting