From: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tomas(at)vondra(dot)me> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: logical replication: restart_lsn can go backwards (and more), seems broken since 9.4 |
Date: | 2024-11-11 13:51:04 |
Message-ID: | CAExHW5sC6qQ=GZHij2o=Kj4gsMnJuE_PExs39Pn3-BqgAL4haA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, Nov 9, 2024 at 5:05 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>
> On 11/8/24 15:57, Ashutosh Bapat wrote:
> > ...
> >
> > After examining the code before reading [2], I came to the same
> > conclusion as Masahiko-san in [2]. We install candidate_restart_lsn
> > based on the running transaction record whose LSN is between
> > restart_lsn and confirmed_flush_lsn. Since candidate_restart_valid of
> > such candidates would be lesser than any confirmed_flush_lsn received
> > from downstream. I am surprised that the fix suggested by Masahiko-san
> > didn't work though. The fix also fix the asymmetry, between
> > LogicalIncreaseXminForSlot and LogicalIncreaseRestartDecodingForSlot,
> > that you have pointed out in your next email. What behaviour do you
> > see with that fix applied?
> >
> >
> > [1] https://www.postgresql.org/message-id/Yz2hivgyjS1RfMKs%40depesz.com
> > [2] https://www.postgresql.org/message-id/CAD21AoBVhYnGBuW_o%3DwEGgTp01qiHNAx1a14b1X9kFXmuBe%3Dsg%40mail.gmail.com
> >
> >
>
> I read that message (and the earlier discussion multiple times) while
> investigating the issue, and TBH it's not very clear to me what the
> conclusion is :-(
>
> There's some discussion about whether the candidate fields should be
> reset on release or not. There are even claims that it might be
> legitimate to not reset the fields and update the restart_lsn. Using
> such "stale" LSN values seems rather suspicious to me, but I don't have
> a proof that it's incorrect. FWIW this unclarity is what I mentioned the
> policy/contract for candidate fields is not explained anywhere.
>
> That being said, I gave that fix a try - see the attached 0001 patch. It
> tweaks LogicalIncreaseRestartDecodingForSlot (it needs a bit more care
> because of the spinlock), and it adds a couple asserts to make sure the
> data.restart_lsn never moves back.
>
> And indeed, with this my stress test script does not crash anymore.
:)
I think the problem is about processing older running transactions
record and setting data.restart_lsn based on the candidates those
records produce. But what is not clear to me is how come a newer
candidate_restart_lsn is available immediately upon WAL sender
restart. I.e. in the sequence of events you mentioned in your first
email
1. 344.139 LOG: starting logical decoding for slot "s1"
2. 344.139 DETAIL: Streaming transactions committing after 1/E97EAC30,
reading WAL from 1/E96FB4D0.
3. 344.140 LOG: logical decoding found consistent point at 1/E96FB4D0
4. 344.140 DETAIL: Logical decoding will begin using saved snapshot.
5. 344.140 LOG: LogicalConfirmReceivedLocation 1/E9865398
6. 344.140 LOG: LogicalConfirmReceivedLocation updating
data.restart_lsn to 1/E979D4C8 (from 1/E96FB4D0)
candidate_restart_valid 0/0 (from 1/E9865398)
candidate_restart_lsn 0/0 (from 1/E979D4C8)
how did candidate_restart_lsn = 1/E979D4C8 and candidate_restart_valid
= 1/E9865398 were set in ReplicationSlot after WAL sender? It means it
must have read and processed running transaction record at 1/E9865398.
If that's true, how come it went back to a running transactions WAL
record at 1/E979D4C8? It should be reading WAL records sequentially,
hence read 1/E979D4C8 first then 1/E9865398.
344.145 LOG: LogicalIncreaseRestartDecodingForSlot s1
candidate_restart_valid_lsn 1/E979D4C8 (0/0)
candidate_restart_lsn 1/E96FB4D0 (0/0)
--
Best Wishes,
Ashutosh Bapat
From | Date | Subject | |
---|---|---|---|
Next Message | Ashutosh Bapat | 2024-11-11 13:58:24 | Re: Add html-serve target to autotools and meson |
Previous Message | Heikki Linnakangas | 2024-11-11 13:25:29 | Re: Support LIKE with nondeterministic collations |