Re: pg_replication_slot_advance xmin handling when active slot becomes inactive

From: Andres Freund <andres(at)anarazel(dot)de>
To: Dimitri Fontaine <Dimitri(dot)Fontaine(at)microsoft(dot)com>
Cc: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_replication_slot_advance xmin handling when active slot becomes inactive
Date: 2021-10-06 19:10:52
Message-ID: 20211006191052.5suneis7unzc5rlg@alap3.anarazel.de
Lists: pgsql-bugs

Hi,

On 2021-10-06 08:22:08 +0000, Dimitri Fontaine wrote:
> I believe we have found another bug in Postgres when using pg_auto_failover. The details can be seen at https://github.com/citusdata/pg_auto_failover/issues/814 ; and the Postgres warning message to consider is the following:
>
> WARNING: oldest xmin is far in the past
>
> When a replication slot switches from active to inactive, whatever xmin
> value that is registered on the replication slot is then kept.

That's required - otherwise the slot would, e.g., stop retaining the xmin
reported via hot_standby_feedback whenever the replication connection breaks.

> It seems to me that we should either document that a replication slot that
> has been active (used in streaming replication) can not be maintained
> through calls to pg_replication_slot_advance later; or better yet that this
> should be made to work, somehow.

You encountered this on a physical slot, by the sound of it? For a logical
slot we cannot just safely change xmin, but
pg_logical_replication_slot_advance() should update it.

I wonder if we should optionally do something similar in
pg_physical_replication_slot_advance(). I.e. read the WAL between the current
position and the "moveto" LSN, and see what xmin should be updated to. If we
see WAL records that would cause conflicts for an older xmin, we can update
xmin to that.
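
To make the idea a bit more concrete, here is a rough sketch in Python rather
than backend C. It is only an illustration under assumptions: WalRecord,
conflict_horizon and xmin_after_advance are hypothetical names, not PostgreSQL
APIs, xids and LSNs are plain integers with wraparound ignored, and moving
xmin to one past the conflict horizon is my reading of "update xmin to that".

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class WalRecord:
    """Hypothetical, simplified stand-in for a decoded WAL record."""
    lsn: int
    # Newest xid whose row versions this record removes, or None if the
    # record cannot conflict with any snapshot (assumed field name).
    conflict_horizon: Optional[int] = None


def xmin_after_advance(slot_xmin: int, restart_lsn: int, moveto: int,
                       records: Iterable[WalRecord]) -> int:
    """Return the xmin a physical slot could hold after advancing to moveto.

    Only records in (restart_lsn, moveto] are considered. Xid wraparound is
    deliberately ignored to keep the sketch short.
    """
    new_xmin = slot_xmin
    for rec in records:
        if not (restart_lsn < rec.lsn <= moveto):
            continue  # outside the range being skipped over
        if rec.conflict_horizon is None:
            continue  # record removes nothing a snapshot could need
        if new_xmin <= rec.conflict_horizon:
            # An xmin this old would conflict with the record, so there is
            # no point in holding it back any further.
            new_xmin = rec.conflict_horizon + 1
    return new_xmin


if __name__ == "__main__":
    wal = [WalRecord(10), WalRecord(20, 505), WalRecord(30, 498), WalRecord(40, 900)]
    print(xmin_after_advance(slot_xmin=500, restart_lsn=5, moveto=30, records=wal))
```

The xmin only ever moves forward here, and a slot advanced over WAL without
any conflicting records keeps its old xmin.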

Greetings,

Andres Freund
