Re: Skipping logical replication transactions on subscriber side

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Alexey Lesovsky <lesovsky(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Skipping logical replication transactions on subscriber side
Date: 2022-04-02 04:08:32
Message-ID: CAA4eK1L7KHvL+UrFiC-KHCx1MSODOYE6yPSggBgPcWajNDNWuQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Apr 2, 2022 at 7:29 AM Noah Misch <noah(at)leadboat(dot)com> wrote:
>
> On Sat, Apr 02, 2022 at 06:49:20AM +0530, Amit Kapila wrote:
>
> After applying datum_to_lsn_skiplsn_1.patch, I get another failure. Logs
> attached.
>

The failure is for the same reason. I noticed that even when skip lsn
value should be 0/0, it is some invalid value, see: "LOG: not started
skipping changes: my_skiplsn 0/B0706F72 finish_lsn 0/14EB7D8". Here,
my_skiplsn should be 0/0 instead of 0/B0706F72. Now, I am not sure why
the LSN's 4 bytes are correct and the other 4 bytes have some random
value. A similar problem is there when we have set the valid value of
skip lsn, see: "LOG: not started skipping changes: my_skiplsn
14EB7D8/B0706F72 finish_lsn 0/14EB7D8". Here the value of my_skiplsn
should be 0/14EB7D8 instead of 14EB7D8/B0706F72.

I am sure that if you create a subscription with the below test and
check the skip lsn value, it will be correct, otherwise, you would
have seen failure in subscription.sql as well. If possible, can you
please check the following example to rule out the possibility:

For example,
Publisher:
Create table t1(c1 int);
Create Publication pub1 for table t1;

Subscriber:
Create table t1(c1 int);
Create Subscription sub1 connection 'dbname = postgres' Publication pub1;
Select subname, subskiplsn from pg_subsription; -- subskiplsn should be 0/0

Alter Subscription sub1 SKIP (LSN = '0/14EB7D8');
Select subname, subskiplsn from pg_subsription; -- subskiplsn should
be 0/14EB7D8

Assuming the above is correct and we are still getting the wrong value
in apply worker, the only remaining suspect is the following code in
GetSubscription:
sub->skiplsn = DatumGetLSN(subform->subskiplsn);

I don't know what is wrong with this because subskiplsn is stored as
pg_lsn which is a fixed value and we should be able to access it by
struct. Do you see any problem with this?

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2022-04-02 05:04:09 Re: wrong fds used for refilenodes after pg_upgrade relfilenode changes Reply-To:
Previous Message Michael Paquier 2022-04-02 03:10:49 Re: Rewriting the test of pg_upgrade as a TAP test - take three - remastered set