Re: BUG #17903: There is a bug in the KeepLogSeg()

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: xu(dot)xw2008(at)163(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17903: There is a bug in the KeepLogSeg()
Date: 2023-04-20 03:04:17
Message-ID: 20230420.120417.1609083651022565895.horikyota.ntt@gmail.com
Lists: pgsql-bugs

At Wed, 19 Apr 2023 10:26:13 +0000, PG Bug reporting form <noreply(at)postgresql(dot)org> wrote in
> I found that KeepLogSeg() has a piece of code that is not correctly.
>
> segno may be larger than currSegNo, since the slot_keep_segs variable is of
> type "uint64", in this case the code "if (currSegNo - segno >
> slot_keep_segs)" is incorrect.
>
> "if (currSegNo - segno < keep_segs)" is also the same.
>
> Checkpoint calls the KeepLogSeg function, and there are many operations
> between recptr and XLogGetReplicationSlotMinimumLSN, including updating the
> pg_control file, so segno may be larger than currSegNo.

Correct. Thanks for the report.

If the checkpointer somehow takes a long time between inserting a
checkpoint record and removing WAL files, and replication advances far
enough in the meantime, this can actually happen. Although that behavior
doesn't directly affect max_slot_wal_keep_size, it does disrupt the
effect of wal_keep_size.

The thinko was that we incorrectly assumed the slot minimum LSN can't
be larger than the checkpoint record LSN. We don't need to consider
max_slot_wal_keep_size if the slot minimum LSN already falls in a
segment beyond currSegNo.
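
To illustrate the wraparound, here is a minimal standalone sketch of the
wal_keep_size comparison quoted in the report. It is not the attached
patch. The function names keep_floor_unguarded and keep_floor_guarded are
hypothetical, XLogSegNo is modeled as a plain uint64_t, and the "don't go
below segment 1" clamp is an assumption of this sketch.

```c
#include <stdint.h>

/* Assumption: segment numbers modeled as unsigned 64-bit integers. */
typedef uint64_t XLogSegNo;

/*
 * Unguarded form of the check quoted in the report. When segno is larger
 * than currSegNo, the unsigned subtraction wraps to a huge value, the
 * comparison is false, and keep_segs is silently ignored.
 */
XLogSegNo
keep_floor_unguarded(XLogSegNo currSegNo, XLogSegNo segno, uint64_t keep_segs)
{
    if (currSegNo - segno < keep_segs)
        segno = (currSegNo <= keep_segs) ? 1 : currSegNo - keep_segs;
    return segno;
}

/*
 * Guarded form (one possible approach, hypothetical): treat a slot
 * minimum that is already ahead of currSegNo as distance zero, so the
 * subtraction can never wrap.
 */
XLogSegNo
keep_floor_guarded(XLogSegNo currSegNo, XLogSegNo segno, uint64_t keep_segs)
{
    uint64_t distance = (segno < currSegNo) ? currSegNo - segno : 0;

    if (distance < keep_segs)
        segno = (currSegNo <= keep_segs) ? 1 : currSegNo - keep_segs;
    return segno;
}
```

With currSegNo = 100, segno = 102 and keep_segs = 8, the unguarded
version returns 102 (the keep setting has no effect), while the guarded
version returns 92. When segno is not ahead of currSegNo, both versions
agree.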

The attached fix works. However, I can't come up with a reasonable
testing script.

This dates back to 13, where max_slot_wal_keep_size was introduced.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
0001-Fix-incorrect-calculation-regarding-max_slot_wal_kee.patch text/x-patch 1.2 KB
