Re: Reviving lost replication slots

From: sirisha chamarthi <sirichamarthi22(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Reviving lost replication slots
Date: 2022-11-10 10:37:52
Message-ID: CAKrAKeXwUOTEVNtw=Jt6ZmFSJsNcFrYvtOhy0oJj_tGefLVB3A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Nov 9, 2022 at 2:37 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> On Fri, Nov 4, 2022 at 1:40 PM sirisha chamarthi
> <sirichamarthi22(at)gmail(dot)com> wrote:
> >
> > Was the intent of setting restart_lsn to InvalidXLogRecPtr to
> > disallow reviving the slot?
> >
>
> I think the intent is to compute the correct value for
> replicationSlotMinLSN as we use restart_lsn for it and using the
> invalidated slot's restart_lsn value for it doesn't make sense.
>

Correct. If a slot is invalidated (lost), shouldn't we also exclude it
when computing the catalog_xmin? I don't see it being set to
InvalidTransactionId in ReplicationSlotsComputeRequiredXmin. I have
attached a small patch to address this; the output after applying the
patch is shown below.

postgres=# select * from pg_replication_slots;
 slot_name |    plugin     | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size | two_phase
-----------+---------------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+-----------
 s2        | test_decoding | logical   |      5 | postgres | f         | f      |            |      |          771 | 0/30466368  | 0/304663A0          | reserved   |      28903824 | f
(1 row)

postgres=# create table t2(c int, c1 char(100));
CREATE TABLE
postgres=# drop table t2;
DROP TABLE
postgres=# vacuum pg_class;
VACUUM
postgres=# select n_dead_tup from pg_stat_all_tables where relname = 'pg_class';
 n_dead_tup
------------
          2
(1 row)

postgres=# select * from pg_stat_replication;
 pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_lsn | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state | reply_time
-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+--------------+-------+----------+-----------+-----------+------------+-----------+-----------+------------+---------------+------------+------------
(0 rows)

postgres=# insert into t1 select * from t1;
INSERT 0 2097152
postgres=# checkpoint;
CHECKPOINT
postgres=# select * from pg_replication_slots;
 slot_name |    plugin     | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size | two_phase
-----------+---------------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+-----------
 s2        | test_decoding | logical   |      5 | postgres | f         | f      |            |      |          771 |             | 0/304663A0          | lost       |               | f
(1 row)

postgres=# vacuum pg_class;
VACUUM
postgres=# select n_dead_tup from pg_stat_all_tables where relname = 'pg_class';
 n_dead_tup
------------
          0
(1 row)
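
To make the idea above concrete, here is a rough standalone sketch of
excluding invalidated slots while aggregating the catalog_xmin. This is
not the attached patch and not the actual PostgreSQL code: SketchSlot,
its "invalidated" flag, xid_precedes() and
compute_required_catalog_xmin() are hypothetical names used only for
illustration, and the wraparound comparison is simplified (it ignores
special XIDs).

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical, simplified stand-in for a replication slot. */
typedef struct SketchSlot
{
    bool          in_use;
    bool          invalidated;   /* assumption: set once the slot is "lost" */
    TransactionId catalog_xmin;
} SketchSlot;

/* Simplified wraparound-aware test: is XID a older than XID b? */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Return the oldest catalog_xmin among usable slots, or
 * InvalidTransactionId if no slot holds one back.
 */
static TransactionId
compute_required_catalog_xmin(const SketchSlot *slots, int nslots)
{
    TransactionId agg = InvalidTransactionId;

    for (int i = 0; i < nslots; i++)
    {
        const SketchSlot *s = &slots[i];

        if (!s->in_use)
            continue;

        /* The proposed behavior: a lost slot must not hold back the horizon. */
        if (s->invalidated)
            continue;

        if (s->catalog_xmin == InvalidTransactionId)
            continue;

        if (agg == InvalidTransactionId || xid_precedes(s->catalog_xmin, agg))
            agg = s->catalog_xmin;
    }

    return agg;
}
```

Under these assumptions, an invalidated slot with catalog_xmin 771 no
longer contributes to the aggregate, which would match the second
vacuum above being able to remove the dead pg_class tuples.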

>
> --
> With Regards,
> Amit Kapila.
>
