From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Suraj Kharage <suraj(dot)kharage(at)enterprisedb(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Replication slot is not able to sync up |
Date: | 2025-05-23 04:55:15 |
Message-ID: | CAA4eK1J2KjBVdKpDUwsuUFKsRma91d8mP06-dADXb--jAZGJKw@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, May 23, 2025 at 9:57 AM Suraj Kharage <
suraj(dot)kharage(at)enterprisedb(dot)com> wrote:
> Hi,
>
> I noticed the behaviour below, where a replication slot cannot be synced if
> any catalog changes happened after its creation.
> The LOG below appears when trying to sync replication slots using the
> pg_sync_replication_slots() function.
> The newly created slot does not appear on the standby after this LOG -
>
> 2025-05-23 07:57:12.453 IST [4178805] LOG: could not synchronize
> replication slot "failover_slot" because remote slot precedes local slot
> 2025-05-23 07:57:12.453 IST [4178805] DETAIL: The remote slot has LSN
> 0/B000060 and catalog xmin 764, but the local slot has LSN 0/B000060 and
> catalog xmin 765.
> 2025-05-23 07:57:12.453 IST [4178805] STATEMENT: SELECT
> pg_sync_replication_slots();
>
> Below is the test case tried on latest master branch -
> =========
> - Create the Primary and start the server.
> wal_level = logical
>
> - Create the physical slot on Primary.
> SELECT pg_create_physical_replication_slot('slot1');
>
> - Setup the standby using pg_basebackup.
> bin/pg_basebackup -D data1 -p 5418 -d "dbname=postgres" -R
>
> primary_slot_name = 'slot1'
> hot_standby_feedback = on
> port = 5419
>
> -- Start the standby.
>
> -- Connect to Primary and create a logical replication slot.
> SELECT pg_create_logical_replication_slot('failover_slot', 'pgoutput',
> false, false, true);
>
> postgres(at)4177929=#select xmin,* from pg_replication_slots ;
> xmin | slot_name | plugin | slot_type | datoid | database |
> temporary | active | active_pid | xmin | catalog_xmin | restart_lsn |
> confirmed_flush_lsn | wal_status | safe_wal_size | two_phas
> e | two_phase_at | inactive_since | conflicting |
> invalidation_reason | failover | synced
>
> ------+---------------+----------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+---------
>
> --+--------------+----------------------------------+-------------+---------------------+----------+--------
> 765 | slot1 | | physical | | | f
> | t | 4177898 | 765 | | 0/B018B00 |
> | reserved | | f
> | | | |
> | f | f
> | failover_slot | pgoutput | logical | 5 | postgres | f
> | f | | | 764 | 0/B000060 | 0/B000098
> | reserved | | f
> | | 2025-05-23 07:55:31.277584+05:30 | f |
> | t | f
> (2 rows)
>
> -- Perform some catalog changes. e.g.:
> create table abc(id int);
> postgres(at)4179034=#select xmin from pg_class where relname='abc';
> xmin
> ------
> 764
> (1 row)
>
> -- Connect to the standby and try to sync the replication slots.
> SELECT pg_sync_replication_slots();
>
> In the logfile, can see below LOG -
> 2025-05-23 07:57:12.453 IST [4178805] LOG: could not synchronize
> replication slot "failover_slot" because remote slot precedes local slot
> 2025-05-23 07:57:12.453 IST [4178805] DETAIL: The remote slot has LSN
> 0/B000060 and catalog xmin 764, but the local slot has LSN 0/B000060 and
> catalog xmin 765.
> 2025-05-23 07:57:12.453 IST [4178805] STATEMENT: SELECT
> pg_sync_replication_slots();
>
> select xmin,* from pg_replication_slots ;
> no rows
>
> Primary -
> postgres(at)4179034=#select xmin,* from pg_replication_slots ;
> xmin | slot_name | plugin | slot_type | datoid | database |
> temporary | active | active_pid | xmin | catalog_xmin | restart_lsn |
> confirmed_flush_lsn | wal_status | safe_wal_size | two_phas
> e | two_phase_at | inactive_since | conflicting |
> invalidation_reason | failover | synced
>
> ------+---------------+----------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+---------
>
> --+--------------+----------------------------------+-------------+---------------------+----------+--------
> 765 | slot1 | | physical | | | f
> | t | 4177898 | 765 | | 0/B018C08 |
> | reserved | | f
> | | | |
> | f | f
> | failover_slot | pgoutput | logical | 5 | postgres | f
> | f | | | 764 | 0/B000060 | 0/B000098
> | reserved | | f
> | | 2025-05-23 07:55:31.277584+05:30 | f |
> | t | f
> (2 rows)
> =========
>
> Is there any way to sync the replication slot once catalog changes have
> been made after its creation?
>
The remote slot (the slot on the primary) should be advanced before you
invoke pg_sync_replication_slots(). Can you call pg_logical_slot_get_changes()
on it before performing the sync? Afterwards, check the catalog_xmin of the
logical slot to ensure that it has moved to 765 in your case.
--
With Regards,
Amit Kapila.
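
To make the condition behind the LOG message concrete, here is a minimal sketch of the "remote slot precedes local slot" comparison, assuming the check is "remote restart LSN is older than the local one, or remote catalog xmin is older than the local one". The function names (parse_lsn, xid_precedes, remote_precedes_local) are hypothetical and only model the comparison; this is not the server's code. With the values from the DETAIL line, the LSNs are equal, so it is the catalog xmin (764 vs. 765) that blocks the sync until the slot on the primary is advanced.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN such as '0/B000060' into a single 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def xid_precedes(a: int, b: int) -> bool:
    """Wraparound-aware comparison of two normal 32-bit transaction IDs.

    Assumption: both values are normal XIDs (special XIDs below 3 are ignored).
    'a' precedes 'b' when the signed 32-bit difference (a - b) is negative.
    """
    diff = (a - b) & 0xFFFFFFFF
    return diff >= 0x80000000


def remote_precedes_local(remote_lsn: str, remote_catalog_xmin: int,
                          local_lsn: str, local_catalog_xmin: int) -> bool:
    """Return True when the remote (primary) slot is behind the local one."""
    return (parse_lsn(remote_lsn) < parse_lsn(local_lsn)
            or xid_precedes(remote_catalog_xmin, local_catalog_xmin))


if __name__ == "__main__":
    # Values from the reported DETAIL line: the sync is refused.
    print(remote_precedes_local("0/B000060", 764, "0/B000060", 765))
    # Once the remote catalog xmin has reached 765, the condition clears.
    print(remote_precedes_local("0/B000060", 765, "0/B000060", 765))
```

Under these assumptions the first call reports that the remote slot precedes the local one and the second does not, which matches the advice above to advance the slot on the primary first.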