RE: Slot's restart_lsn may point to removed WAL segment after hard restart unexpectedly

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Vitaly Davydov' <v(dot)davydov(at)postgrespro(dot)ru>, vignesh C <vignesh21(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "tomas(at)vondra(dot)me" <tomas(at)vondra(dot)me>
Subject: RE: Slot's restart_lsn may point to removed WAL segment after hard restart unexpectedly
Date: 2025-06-18 07:25:32
Message-ID: OSCPR01MB14966F3D6F2E8B6C982890880F572A@OSCPR01MB14966.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Dear Vitaly,

I've been working on the bug...

> This assert was introduced in the patch. Now, I think, it is a wrong one. Let me
> please explain one of the possible scenarios when it can be triggered. In case
> of physical replication, when walsender receives a standby reply message, it
> calls PhysicalConfirmReceivedLocation function which updates slots' restart_lsn
> from received flush_lsn value. This value may be older than the saved value.

To confirm: can you explain why the walsender would receive an old LSN?
The value is sent by the walreceiver, so is there a case where LogstreamResult.Flush
can go backward? I'm not sure we can accept that situation.
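
To make the question concrete, here is a minimal sketch in C of the scenario
as I understand it from your description. It is a toy model, not the actual
walsender code: ToyLsn, ToySlot, and toy_confirm_flush are hypothetical names,
and it assumes the slot simply adopts whatever flush position the standby
reports.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a WAL position (assumption: a plain 64-bit integer). */
typedef uint64_t ToyLsn;

/* Hypothetical, simplified slot holding only the field under discussion. */
typedef struct ToySlot
{
    ToyLsn restart_lsn;
} ToySlot;

/*
 * Adopt the flush position reported in a standby reply, without checking
 * whether it is older than the saved value (assumed behavior for this sketch).
 * Returns true if restart_lsn moved backward as a result.
 */
static bool
toy_confirm_flush(ToySlot *slot, ToyLsn flush_lsn)
{
    bool went_backward = flush_lsn < slot->restart_lsn;

    slot->restart_lsn = flush_lsn;
    return went_backward;
}
```

For example, with restart_lsn at 0x3000, a reply carrying 0x2000 makes
toy_confirm_flush return true and leaves the slot at 0x2000. If the
walreceiver never reports a flush position lower than the previous one, that
case cannot happen, which is why I'd like to understand where the older value
comes from.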

Best regards,
Hayato Kuroda
FUJITSU LIMITED
