Re: logical decoding and replication of sequences, take 2

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Subject: Re: logical decoding and replication of sequences, take 2
Date: 2023-12-13 12:56:34
Message-ID: CAA4eK1LFise9iN+NN=agrk4prR1qD+ebvzNjKAWUog2+hy3HxQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Dec 7, 2023 at 10:41 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Wed, Dec 6, 2023 at 7:09 PM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> >
> > Yes, if something like this happens, that'd be a problem:
> >
> > 1) decoding starts, with
> >
> > SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT
> >
> > 2) transaction that creates a new relfilenode gets decoded, but we skip
> > it because we don't have the correct snapshot
> >
> > 3) snapshot changes to SNAPBUILD_FULL_SNAPSHOT
> >
> > 4) we decode sequence change from nextval() for the sequence
> >
> > This would lead to us attempting to apply sequence change for a
> > relfilenode that's not visible yet (and may even get aborted).
> >
> > But can this even happen? Can we start decoding in the middle of a
> > transaction? How come this wouldn't affect e.g. XLOG_HEAP2_NEW_CID,
> > which is also skipped until SNAPBUILD_FULL_SNAPSHOT. Or logical
> > messages, where we also call the output plugin in non-transactional cases.
>
> It's not a problem for logical messages because whether the message is
> transactional or non-transactional is decided when the message itself
> is WAL-logged. But here our problem starts with deciding whether the
> change is transactional vs non-transactional, because if we insert the
> 'relfilenode' in the hash then the subsequent sequence change in the
> same transaction would be considered transactional, otherwise
> non-transactional.
>

It is correct that we can make a wrong decision about whether a change
is transactional or non-transactional when sequence DDL happens before
the SNAPBUILD_FULL_SNAPSHOT state and the sequence operation happens
after that state. However, one thing to note here is that we won't try
to stream such a change because for non-transactional cases we don't
proceed unless the snapshot is in a consistent state. Now, if the
decision had been correct then we would probably have queued the
sequence change and discarded it at commit.
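
To make the scenario concrete, here is a minimal toy model in C. It is
only a sketch, not PostgreSQL code: the state names mirror the
SnapBuildState values, but ToyDecoder, SeqAction and the toy_*
functions are hypothetical names invented for illustration, and the
fixed-size array stands in for the relfilenode hash discussed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Snapshot builder states, named after PostgreSQL's SnapBuildState values. */
typedef enum
{
    SNAPBUILD_START,
    SNAPBUILD_BUILDING_SNAPSHOT,
    SNAPBUILD_FULL_SNAPSHOT,
    SNAPBUILD_CONSISTENT
} SnapState;

/* Hypothetical outcome of decoding one sequence change. */
typedef enum
{
    SEQ_SKIPPED,            /* nothing queued, nothing sent */
    SEQ_QUEUED_IN_XACT,     /* treated as transactional, decided at commit */
    SEQ_SENT_IMMEDIATELY    /* treated as non-transactional */
} SeqAction;

#define MAX_TRACKED 16

/* Hypothetical stand-in for the decoder plus its relfilenode hash. */
typedef struct
{
    SnapState state;
    unsigned  tracked[MAX_TRACKED]; /* relfilenodes created by decoded xacts */
    size_t    ntracked;
} ToyDecoder;

/* Sequence DDL: remembered only once we have a full snapshot. */
bool
toy_decode_new_relfilenode(ToyDecoder *d, unsigned relfilenode)
{
    if (d->state < SNAPBUILD_FULL_SNAPSHOT)
        return false;           /* record skipped, hash entry never made */
    assert(d->ntracked < MAX_TRACKED);
    d->tracked[d->ntracked++] = relfilenode;
    return true;
}

/* A change is "transactional" only if its relfilenode is in the hash. */
bool
toy_is_transactional(const ToyDecoder *d, unsigned relfilenode)
{
    for (size_t i = 0; i < d->ntracked; i++)
        if (d->tracked[i] == relfilenode)
            return true;
    return false;
}

/* Sequence change from nextval(). */
SeqAction
toy_decode_sequence_change(const ToyDecoder *d, unsigned relfilenode)
{
    if (d->state < SNAPBUILD_FULL_SNAPSHOT)
        return SEQ_SKIPPED;
    if (toy_is_transactional(d, relfilenode))
        return SEQ_QUEUED_IN_XACT;
    /* Non-transactional path: do not proceed unless consistent. */
    if (d->state < SNAPBUILD_CONSISTENT)
        return SEQ_SKIPPED;
    return SEQ_SENT_IMMEDIATELY;
}
```

In this model, if the DDL is decoded while the state is
SNAPBUILD_BUILDING_SNAPSHOT and the sequence change arrives at
SNAPBUILD_FULL_SNAPSHOT, the change is misclassified as
non-transactional but still ends up skipped rather than sent. Had the
DDL been seen at SNAPBUILD_FULL_SNAPSHOT, the same change would have
been queued in the transaction instead.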

One place where we deviate here is that for non-sequence transactional
cases (including logical messages), we start queuing the changes as
soon as we reach the SNAPBUILD_FULL_SNAPSHOT state (provided
SnapBuildProcessChange() returns true, which is quite possible) and
make the final decision at commit/prepare/abort time. However, that
won't be the case for sequences, because determining whether a change
is transactional depends on one of the prior records. Now, I am not
completely sure at this stage whether such a deviation can cause any
problem, or whether we are okay to have such a deviation for
sequences.

--
With Regards,
Amit Kapila.
