Re: pg_rewind WAL segments deletion pitfall

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: bungina(at)gmail(dot)com
Cc: cyberdemn(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pg_rewind WAL segments deletion pitfall
Date: 2022-09-27 07:50:54
Message-ID: 20220927.165054.2142431385277288474.horikyota.ntt@gmail.com
Lists: pgsql-bugs pgsql-hackers

At Thu, 1 Sep 2022 13:33:09 +0200, Polina Bungina <bungina(at)gmail(dot)com> wrote in
> Here is the new version of the patch that includes the changes you
> suggested. It is smaller now but I doubt if it is as easy to understand as
> it used to be.

pg_rewind works in two steps. First it constructs a file map that
decides the action for each file; second, it performs the file
operations according to that map. So, if we are going to do
something to some files, that action should be recorded in the file
map, I think.
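
A minimal sketch of that two-step shape, not pg_rewind's actual code: all `plan_` names are hypothetical, and the decision rule is deliberately oversimplified (copy whatever the source has, remove what only the target has). It only illustrates that the second step acts on nothing but what the first step recorded.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the file map types. */
typedef enum
{
	PLAN_NONE = 0,				/* leave the target's file alone */
	PLAN_COPY,					/* copy the file from the source */
	PLAN_REMOVE					/* delete the file from the target */
} plan_action;

typedef struct
{
	const char *path;
	int			in_source;		/* file exists on the source */
	int			in_target;		/* file exists on the target */
	plan_action action;
} plan_entry;

/* Step 1: only record a decision per entry; nothing is touched on disk. */
static void
plan_build(plan_entry *map, size_t n)
{
	for (size_t i = 0; i < n; i++)
	{
		if (map[i].in_source)
			map[i].action = PLAN_COPY;
		else if (map[i].in_target)
			map[i].action = PLAN_REMOVE;
		else
			map[i].action = PLAN_NONE;
	}
}

/*
 * Step 2: act purely on what the map says.  This sketch just writes one
 * line per operation to 'out' and returns the number of operations.
 */
static size_t
plan_execute(const plan_entry *map, size_t n, FILE *out)
{
	size_t		ops = 0;

	for (size_t i = 0; i < n; i++)
	{
		switch (map[i].action)
		{
			case PLAN_COPY:
				fprintf(out, "copy %s\n", map[i].path);
				ops++;
				break;
			case PLAN_REMOVE:
				fprintf(out, "remove %s\n", map[i].path);
				ops++;
				break;
			case PLAN_NONE:
				break;
		}
	}
	return ops;
}
```

In this sketch a segment that exists only on the target ends up as PLAN_REMOVE, so the only way to keep it is for the map itself to say so before the second step runs.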

Regarding the patch, pg_rewind starts reading segments from the
divergence point back to the nearest checkpoint, then moves forward
during rewinding. So, the fact that SimpleXLogPageRead has read a
segment suggests that the segment is required during the next startup.
So I don't think we need to move around the keepWalSeg flag. All
files that are wanted while rewinding should be preserved
unconditionally.
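
To make "the segments between the checkpoint and the divergence point" concrete, here is a rough sketch of how an LSN maps to a segment number and file name. It assumes the default 16 MB WAL segment size and a single timeline, ignores segment-boundary edge cases, and every `sketch_` name is hypothetical rather than taken from pg_rewind.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: default 16 MB segments; real clusters may be configured otherwise. */
#define SKETCH_WAL_SEGMENT_SIZE		UINT64_C(0x1000000)
#define SKETCH_SEGMENTS_PER_XLOGID	(UINT64_C(0x100000000) / SKETCH_WAL_SEGMENT_SIZE)

/* Segment number that contains the given LSN. */
static uint64_t
sketch_lsn_to_segno(uint64_t lsn)
{
	return lsn / SKETCH_WAL_SEGMENT_SIZE;
}

/* Format the 24-character WAL file name for a timeline and segment number. */
static void
sketch_segment_name(char *buf, size_t buflen, uint32_t tli, uint64_t segno)
{
	snprintf(buf, buflen, "%08" PRIX32 "%08" PRIX32 "%08" PRIX32,
			 tli,
			 (uint32_t) (segno / SKETCH_SEGMENTS_PER_XLOGID),
			 (uint32_t) (segno % SKETCH_SEGMENTS_PER_XLOGID));
}

/*
 * Number of segments spanning [checkpoint_lsn, divergence_lsn], inclusive.
 * Returns 0 if the range is empty.
 */
static uint64_t
sketch_segments_to_keep(uint64_t checkpoint_lsn, uint64_t divergence_lsn)
{
	if (divergence_lsn < checkpoint_lsn)
		return 0;
	return sketch_lsn_to_segno(divergence_lsn)
		- sketch_lsn_to_segno(checkpoint_lsn) + 1;
}
```

For example, with a checkpoint at 0/2FFFF00 and a divergence point at 0/5000100, the sketch counts four segments (numbers 2 through 5) on the target that would have to survive the rewind.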

It's annoying that the file paths for the file map and open(2) have
different top directories. But sharing the same path string between
the two seems rather ugly..

I feel uncomfortable directly touching the internals of file_entry_t
outside filemap.c. I'd like to hide the internals in filemap.c, but
pg_rewind already does that..

+		/*
+		 * Some entries (WAL segments) already have an action assigned
+		 * (see SimpleXLogPageRead()).
+		 */
+		if (entry->action == FILE_ACTION_NONE)
+			continue;
 		entry->action = decide_file_action(entry);

It might be more reasonable to call decide_file_action() only when the
action is still UNDECIDED.
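
Roughly what I have in mind, as a self-contained sketch rather than a patch: the `fm_` types and functions below are hypothetical stand-ins, not the real file_entry_t or decide_file_action(), and the decision rule is a placeholder.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the file map types. */
typedef enum
{
	FM_UNDECIDED = 0,			/* nobody has chosen an action yet */
	FM_NONE,					/* keep the target's file as it is */
	FM_COPY,
	FM_REMOVE
} fm_action;

typedef struct
{
	const char *path;
	int			in_source;		/* file exists on the source */
	fm_action	action;
} fm_entry;

/*
 * Called for a segment that was read while rewinding: pin it by recording
 * FM_NONE in the map.  Returns 1 if the entry was found, 0 otherwise.
 */
static int
fm_keep_segment(fm_entry *map, size_t n, const char *path)
{
	for (size_t i = 0; i < n; i++)
	{
		if (strcmp(map[i].path, path) == 0)
		{
			map[i].action = FM_NONE;
			return 1;
		}
	}
	return 0;
}

/* Placeholder rule: copy what the source has, remove the rest. */
static fm_action
fm_decide(const fm_entry *entry)
{
	return entry->in_source ? FM_COPY : FM_REMOVE;
}

/* Decide only entries that are still undecided; pinned ones are left alone. */
static void
fm_decide_remaining(fm_entry *map, size_t n)
{
	for (size_t i = 0; i < n; i++)
	{
		if (map[i].action == FM_UNDECIDED)
			map[i].action = fm_decide(&map[i]);
	}
}
```

Testing for "still undecided" instead of "already NONE" means any action assigned earlier is respected, not just the one this patch happens to set.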

> The need of manipulations with the target’s pg_wal/archive_status directory
> is a question to discuss…
>
> At first glance it seems to be useless for .ready files: checkpointer
> process will anyway recreate them if archiving is enabled on the rewound
> old primary and we will finally have them in the archive. As for the .done
> files, it seems reasonable to follow the pg_basebackup logic and keep .done
> files together with the corresponding segments (those between the last
> common checkpoint and the point of divergence) to protect them from being
> archived once again.
>
> But on the other hand it seems to be not that straightforward: imagine we
> have WAL segment X on the target along with X.done file and we decide to
> preserve them both (or we download it from archive and force .done file
> creation), while archive_mode was set to ‘always’ and the source (promoted
> replica) also still has WAL segment X and X.ready file. After pg_rewind we
> will end up with both X.ready and X.done, which seems to be not a good
> situation (but most likely not critical either).

Thanks for the thought. Yes, it's not so straightforward. And, as you
mentioned, the worst result of not doing that is that some
already-archived segments are archived again at the next run, which is
generally harmless. So I think we're OK to ignore that in this patch
and then create another patch if we still want to do that.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
