Re: Slot's restart_lsn may point to removed WAL segment after hard restart unexpectedly

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "tomas(at)vondra(dot)me" <tomas(at)vondra(dot)me>
Subject: Re: Slot's restart_lsn may point to removed WAL segment after hard restart unexpectedly
Date: 2025-06-17 20:18:30
Message-ID: 870747.1750191510@sss.pgh.pa.us
Lists: pgsql-hackers

vignesh C <vignesh21(at)gmail(dot)com> writes:
> While tracking buildfarm for one of other commits, I noticed this failure:
> TRAP: failed Assert("s->data.restart_lsn >= s->last_saved_restart_lsn"),
> File: "../pgsql/src/backend/replication/slot.c", Line: 1813, PID: 3945797

My animal mamba is also showing this assertion failure, but in a
different test (recovery/t/040_standby_failover_slots_sync.pl).
It's failed in two out of its three runs since ca307d5ce went in,
so it's more reproducible than scorpion's report, though still not
perfectly so.

I suspect that mamba is prone to this simply because it's slow,
although perhaps there's a different reason. Anyway, happy to
investigate manually if there's something you'd like me to
check for.

regards, tom lane
