Re: logical decoding and replication of sequences, take 2

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Subject: Re: logical decoding and replication of sequences, take 2
Date: 2023-12-14 05:22:57
Message-ID: CAFiTN-sYpyUBabxopJysqH3DAp4OZUCTi6m_qtgt8d32vDcWSA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Dec 13, 2023 at 6:26 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> > > But can this even happen? Can we start decoding in the middle of a
> > > transaction? How come this wouldn't affect e.g. XLOG_HEAP2_NEW_CID,
> > > which is also skipped until SNAPBUILD_FULL_SNAPSHOT. Or logical
> > > messages, where we also call the output plugin in non-transactional cases.
> >
> > It's not a problem for logical messages because whether the message is
> > transactional or non-transactional is decided when the message itself
> > is WAL-logged. But here our problem starts with deciding whether the
> > change is transactional vs non-transactional, because if we insert the
> > 'relfilenode' in the hash then a subsequent sequence change in the same
> > transaction would be considered transactional, otherwise
> > non-transactional.
> >
>
> It is correct that we can make a wrong decision about whether a change
> is transactional or non-transactional when sequence DDL happens before
> the SNAPBUILD_FULL_SNAPSHOT state and the sequence operation happens
> after that state. However, one thing to note here is that we won't try
> to stream such a change because for non-transactional cases we don't
> proceed unless the snapshot is in a consistent state. Now, if the
> decision had been correct then we would probably have queued the
> sequence change and discarded at commit.
>
> One thing that we deviate here is that for non-sequence transactional
> cases (including logical messages), we immediately start queuing the
> changes as soon as we reach SNAPBUILD_FULL_SNAPSHOT state (provided
> SnapBuildProcessChange() returns true which is quite possible) and
> take final decision at commit/prepare/abort time. However, that won't
> be the case for sequences because of the dependency of determining
> transactional cases on one of the prior records. Now, I am not
> completely sure at this stage if such a deviation can cause any
> problem, or whether we are okay to have such a deviation for
> sequences.

Okay, so the particular scenario I raised turns out to be harmless.
Although we treat a transactional sequence operation as
non-transactional, we also know that if some of a transaction's changes
were skipped because the snapshot was not yet FULL, that transaction
cannot be streamed: it has to have committed before the snapshot
becomes CONSISTENT (based on the snapshot state change machinery). By
the same logic, because the snapshot is not yet consistent, the
non-transactional sequence changes are skipped as well. The only thing
that makes me a bit uncomfortable is that even though the result is not
wrong, we have made a wrong intermediate decision, i.e. considered a
transactional change as non-transactional.

One solution to this issue is to add the sequence relids to the hash
even if the snapshot state has not reached FULL. That hash is only
maintained for deciding whether the sequence was changed in that
transaction or not, so not adding such relids to the hash seems like
the root cause of the issue. Honestly, I haven't analyzed in detail how
easy it would be to add only these changes to the hash or what the
other dependencies are, but this seems like a worthwhile direction
IMHO.
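
To make the difference concrete, here is a toy model in Python. It is
not PostgreSQL code and not taken from the patch: the class, the method
names, and the always_track flag are all made up for illustration, and
the three states only mimic the ordering of the snapshot builder
states. It just shows how the classification flips depending on whether
the hash entry is added before the FULL state.

```python
from enum import IntEnum


class SnapState(IntEnum):
    # Toy stand-ins for the snapshot builder states; only the ordering matters.
    BUILDING = 0
    FULL = 1
    CONSISTENT = 2


class ToyDecoder:
    """Hypothetical model of the transactional/non-transactional decision."""

    def __init__(self, always_track=False):
        self.state = SnapState.BUILDING
        # always_track=True models the proposal: add to the hash in any state.
        self.always_track = always_track
        # relfilenode -> xid of the transaction that created it
        self.created_in_xact = {}

    def on_sequence_relfilenode_created(self, xid, relfilenode):
        # Current behaviour in this model: records seen before FULL are
        # skipped entirely, so the hash entry is never made.
        if self.state < SnapState.FULL and not self.always_track:
            return
        self.created_in_xact[relfilenode] = xid

    def on_sequence_change(self, relfilenode):
        if relfilenode in self.created_in_xact:
            kind = "transactional"
            # Transactional changes are queued once the snapshot is FULL.
            action = "queue" if self.state >= SnapState.FULL else "skip"
        else:
            kind = "non-transactional"
            # Non-transactional changes need a CONSISTENT snapshot.
            action = "apply" if self.state == SnapState.CONSISTENT else "skip"
        return kind, action


def run(always_track):
    d = ToyDecoder(always_track)
    # Sequence DDL is decoded before the snapshot reaches FULL ...
    d.on_sequence_relfilenode_created(xid=700, relfilenode=16384)
    # ... and the sequence change in the same transaction comes after it.
    d.state = SnapState.FULL
    return d.on_sequence_change(16384)


print(run(always_track=False))  # ('non-transactional', 'skip')
print(run(always_track=True))   # ('transactional', 'queue')
```

In both runs nothing is sent downstream, which matches the observation
that the end result is not wrong. With the entry always added, the
change is classified correctly and queued, and would then be discarded
at commit like any other change of a transaction that started too
early.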

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
