Re: Seeking Clarification on Logical Replication Start LSN

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Pradeep Kumar <spradeepkumar29(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Seeking Clarification on Logical Replication Start LSN
Date: 2024-03-02 11:35:52
Message-ID: CAA4eK1K3vjj7o-x8AbDvrF_O989enWcb_2_aWorBWj1+_dYx5w@mail.gmail.com
Lists: pgsql-hackers

On Tue, Feb 27, 2024 at 5:56 PM Pradeep Kumar <spradeepkumar29(at)gmail(dot)com> wrote:
>
> Dear Postgres Community,
>
> I hope this email finds you well. I am reaching out to seek clarification on an issue I am encountering with logical replication in PostgreSQL.
>
> My specific question pertains to determining the appropriate LSN (Log Sequence Number) from which to start logical replication. Allow me to provide detailed context for better understanding:
>
> During the process of performing a parallel pg_basebackup, I concurrently execute DML queries. As part of the pg_basebackup command, I utilize the option create-slot to create a replication slot. Subsequently, upon completion of the base backup, I initiate logical replication using the restart_lsn obtained during the execution of the pg_basebackup command. My intention is to ensure consistency between two database clusters.
>
> However, I am encountering errors during this process. Specifically, I receive the following error message on the source side:
>
> """
> 2024-02-27 16:20:09.271 IST [2838457] ERROR: duplicate key value violates unique constraint "table_15_36_pkey"
> 2024-02-27 16:20:09.271 IST [2838457] DETAIL: Key (col_1, col_2)=(23, 2024-02-27 15:14:24.332557) already exists.
> 2024-02-27 16:20:09.272 IST [2834967] LOG: background worker "logical replication worker" (PID 2838457) exited with exit code 1
> """
>
> Upon analysis, it appears that the errors stem from starting the logical replication with an incorrect LSN, one that has already been applied to the target side, leading to duplicate key conflicts.
>
> In light of this issue, I seek guidance on determining the appropriate LSN from which to commence logical replication.
>
> To further clarify my problem:
>
> 1) I have a source machine and a target machine.
> 2) I perform a backup from the source to the target using pg_basebackup.
> 3) Prior to initiating the base backup, I create logical replication slots on the source machine.
> 4) During the execution of pg_basebackup, DML queries are executed, and I aim to replicate this data on the target machine.
> 5) My dilemma lies in determining the correct LSN to begin the logical replication process.
>

I think the reason for the problem you are seeing is that
pg_basebackup also includes the WAL generated during the backup if you
specify -X method. See [1]. Now, as you created the logical slot
before starting the backup, the slot can send changes that are already
present in the backup, so data duplication is possible. I don't see a
very straightforward way, but you might be able to achieve your
desired purpose if you can somehow identify the last WAL location
copied in the backup and use that as your starting point for logical
replication.
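
To make the comparison concrete, here is a minimal sketch, not a tested
procedure. It assumes you have already obtained two LSN strings by some
means: the position of the slot and the last WAL location contained in
the backup. The function names (parse_lsn, format_lsn, needs_advance)
and the sample values are hypothetical; the only PostgreSQL fact relied
on is that an LSN is printed as two hexadecimal numbers separated by a
slash, the high and low 32 bits of a 64-bit position.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN such as '0/3000148' into a 64-bit integer."""
    high, low = text.strip().split("/")
    return (int(high, 16) << 32) | int(low, 16)


def format_lsn(value: int) -> str:
    """Convert a 64-bit integer back into the 'X/Y' LSN notation."""
    return f"{value >> 32:X}/{value & 0xFFFFFFFF:X}"


def needs_advance(slot_lsn: str, backup_end_lsn: str) -> bool:
    """Return True if the slot still points before the end of the backup,
    i.e. it could resend changes that the backup already contains."""
    return parse_lsn(slot_lsn) < parse_lsn(backup_end_lsn)


if __name__ == "__main__":
    # Hypothetical example values, not taken from a real cluster.
    slot_lsn = "0/2000028"
    backup_end_lsn = "0/3000148"
    if needs_advance(slot_lsn, backup_end_lsn):
        print(f"slot at {slot_lsn} is behind backup end {backup_end_lsn}")
```

If the slot turns out to be behind, one option that might help is
moving it forward on the source with pg_replication_slot_advance()
before starting the subscription; whether that alone is sufficient for
your setup is something you would need to verify.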

[1] - https://www.postgresql.org/docs/devel/app-pgbasebackup.html

--
With Regards,
Amit Kapila.
