Re: logical decoding and replication of sequences, take 2

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: logical decoding and replication of sequences, take 2
Date: 2023-11-27 06:04:46
Message-ID: CAA4eK1LTaGxk54oAu=tPfe8XB3-3JvbJqCAFw0px1mROpZTw6A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 27, 2023 at 6:41 AM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
> I've been cleaning up the first two patches to get them committed soon
> (adding the decoding infrastructure + test_decoding), cleaning up stale
> comments, updating commit messages etc. And I think it's ready to go,
> but it's too late over, so I plan going over once more tomorrow and then
> likely push. But if someone wants to take a look, I'd welcome that.
>
> The one issue I found during this cleanup is that the patch was missing
> the changes introduced by 29d0a77fa660 for decoding of other stuff.
>
> commit 29d0a77fa6606f9c01ba17311fc452dabd3f793d
> Author: Amit Kapila <akapila(at)postgresql(dot)org>
> Date: Thu Oct 26 06:54:16 2023 +0530
>
> Migrate logical slots to the new node during an upgrade.
> ...
>
> I fixed that, but perhaps someone might want to double check ...
>
>
> 0003 is here just for completeness - that's the part adding sequences to
> built-in replication. I haven't done much with it, it needs some cleanup
> too to get it committable. I don't intend to push that right after
> 0001+0002, though.
>
>
> While going over 0001, I realized there might be an optimization for
> ReorderBufferSequenceIsTransactional. As coded in 0001, it always
> searches through all top-level transactions, and if there's many of them
> that might be expensive, even if very few of them have any relfilenodes
> in the hash table. It's still linear search, and it needs to happen for
> each sequence change.
>
> But can the relfilenode even be in some other top-level transaction? How
> could it be - our transaction would not see it, and wouldn't be able to
> generate the sequence change. So we should be able to simply check *our*
> transaction (or if it's a subxact, the top-level transaction). Either
> it's there (and it's transactional change), or not (and then it's
> non-transactional change).
>

I also think the relfilenode should be part of either the current
top-level xact or one of its subxact, so looking at all the top-level
transactions for each change doesn't seem advisable.

> The 0004 does this.
>
> This of course hinges on when exactly the transactions get created, and
> assignments processed. For example if this would fire before the txn
> gets assigned to the top-level one, this would break. I don't think this
> can happen thanks to the immediate logging of assignments, but I'm too
> tired to think about it now.
>

This needs some thought because I think we can't guarantee the
association till we reach the point where we can actually decode the
xact. See comments in AssertTXNLsnOrder() [1].

I noticed few minor comments while reading the patch:
1.
+ * turned on here because the non-transactional logical message is
+ * decoded without waiting for these records.

Instead of '.. logical message', shouldn't we say sequence change message?

2.
+ /*
+ * If we found an entry with matchine relfilenode,

typo (matchine)

3.
+ Note that this may not the value obtained by the process updating the
+ process, but the future sequence value written to WAL (typically about
+ 32 values ahead).

/may not the value/may not be the value

[1] -
/*
* Skip the verification if we don't reach the LSN at which we start
* decoding the contents of transactions yet because until we reach the
* LSN, we could have transactions that don't have the association between
* the top-level transaction and subtransaction yet and consequently have
* the same LSN. We don't guarantee this association until we try to
* decode the actual contents of transaction.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Richard Guo 2023-11-27 06:14:32 Re: Incorrect comment in tableam.h regarding GetHeapamTableAmRoutine()
Previous Message Zhijie Hou (Fujitsu) 2023-11-27 06:02:00 RE: Synchronizing slots from primary to standby