Re: pg_rewind WAL segments deletion pitfall

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: cyberdemn(at)gmail(dot)com
Cc: michael(at)paquier(dot)xyz, torikoshia(at)oss(dot)nttdata(dot)com, pgsql-hackers(at)postgresql(dot)org, bungina(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pg_rewind WAL segments deletion pitfall
Date: 2023-08-24 00:45:38
Message-ID: 20230824.094538.2107929879048686192.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

At Wed, 23 Aug 2023 13:44:52 +0200, Alexander Kukushkin <cyberdemn(at)gmail(dot)com> wrote in
> On Tue, 22 Aug 2023 at 07:32, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> > I don't like much this patch. While it takes correctly advantage of
> > the backward record read logic from SimpleXLogPageRead() able to
> > handle correctly timeline jumps, it creates a hidden dependency in the
> > code between the hash table from filemap.c and the page callback.
> > Wouldn't it be simpler to build a list of the segment names using the
> > information from WALOpenSegment and build this list in
> > findLastCheckpoint()?
>
> I think the first version of the patch more or less did that. Not
> necessarily a list, but a hash table of WAL file names that we want to
> keep. But Kyotaro Horiguchi didn't like it and suggested creating entries
> in the filemap.c hash table instead.
> But, I agree, doing it directly from the findLastCheckpoint() makes the
> code easier to understand.
...
> > + /*
> > + * Some entries (WAL segments) already have an action assigned
> > + * (see SimpleXLogPageRead()).
> > + */
> > + if (entry->action == FILE_ACTION_UNDECIDED)
> > + entry->action = decide_file_action(entry);
> >
> > This change makes me a bit uneasy, per se my previous comment with the
> > additional code dependencies.
> >
>
> We can revert to the original approach (see
> v1-0001-pg_rewind-wal-deletion.patch from the very first email) if you like.

On the other hand, that approach brings in another source that
suggests the way that file should be handled. I still think that
entry->action should be the only source. However, it seems I'm in the
minority here. So I'm not tied to that approach.

> > I think that this scenario deserves a test case. If one wants to
> > emulate a delay in WAL archiving, it is possible to set
> > archive_command to a command that we know will fail, for instance.
> >
>
> Yes, I totally agree, it is on our radar, but meanwhile please see the new
> version, just to check if I correctly understood your idea.

Agreed.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2023-08-24 02:26:10 BUG #18068: Insufficient permission unless SUPERUSER
Previous Message Tom Lane 2023-08-23 20:53:12 Re: BUG #18059: Unexpected error 25001 in stored procedure

Browse pgsql-hackers by date

  From Date Subject
Next Message Ian Lawrence Barwick 2023-08-24 01:22:49 pg_stat_get_backend_subxact() and backend IDs?
Previous Message Tatsuo Ishii 2023-08-24 00:15:51 Re: pgbench: allow to exit immediately when any client is aborted