Re: Skipping logical replication transactions on subscriber side

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, Alexey Lesovsky <lesovsky(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Skipping logical replication transactions on subscriber side
Date: 2021-12-09 08:53:49
Message-ID: CAD21AoCG+X+xdbW=ZUxaqGN4G+b6JJW4vKdxbu6oRxa6NY8XNw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Dec 9, 2021 at 11:47 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Dec 8, 2021 at 4:36 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Wed, Dec 8, 2021 at 5:54 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Dec 8, 2021 at 12:36 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > >
> > > > On Wed, Dec 8, 2021 at 3:50 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > >
> > > > > On Wed, Dec 8, 2021 at 11:48 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > > >
> > > > > > >
> > > > > > > Can't we think of just allowing prepare in this case and updating the
> > > > > > > skip_xid only at commit time? I see that in this case, we would be
> > > > > > > doing prepare for a transaction that has no changes but as such cases
> > > > > > > won't be common, isn't that acceptable?
> > > > > >
> > > > > > In this case, we will end up committing both the prepared (empty)
> > > > > > transaction and the transaction that updates the catalog, right?
> > > > > >
> > > > >
> > > > > Can't we do this catalog update before committing the prepared
> > > > > transaction? If so, both in prepared and non-prepared cases, our
> > > > > implementation could be the same and we have a reason to accomplish
> > > > > the catalog update in the same transaction for which we skipped the
> > > > > changes.
> > > >
> > > > But in case of a crash between these two transactions, given that
> > > > skip_xid is already cleared how do we know the prepared transaction
> > > > that was supposed to be skipped?
> > > >
> > >
> > > I was thinking of doing it as one transaction at the time of
> > > commit_prepare. Say, in function apply_handle_commit_prepared(), if we
> > > check whether the skip_xid is the same as prepare_data.xid then update
> > > the catalog and set origin_lsn/timestamp in the same transaction. Why
> > > do we need two transactions for it?
> >
> > I meant the two transactions are the prepared transaction and the
> > transaction that updates the catalog. If I understand your idea
> > correctly, in apply_handle_commit_prepared(), we update the catalog
> > and set origin_lsn/timestamp. These are done in the same transaction.
> > Then, we commit the prepared transaction, right?
> >
>
> I am thinking that we can start a transaction, update the catalog,
> commit that transaction. Then start a new one to update
> origin_lsn/timestamp, finishprepared, and commit it. Now, if it
> crashes after the first transaction, only commit prepared will be
> resent again and this time we don't need to update the catalog as that
> entry would be already cleared.

Sounds good. In the crash case, it should be fine since we will just
commit an empty transaction. The same is true for the case where
skip_xid has been changed after skipping and preparing the transaction
and before handling commit_prepared.
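
To make the ordering concrete, here is a toy Python model of the two-transaction handling described above. It is only a sketch and not PostgreSQL code: the Subscriber class, its methods, and the crash_between flag are hypothetical names invented for illustration, and the model tracks nothing beyond skip_xid, the set of prepared transactions, and the origin LSN.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


class SimulatedCrash(Exception):
    """Models the apply worker crashing between the two transactions."""


@dataclass
class Subscriber:
    # Hypothetical stand-ins for subscriber-side state; not real catalog fields.
    skip_xid: Optional[int] = None
    prepared: Set[int] = field(default_factory=set)
    origin_lsn: int = 0

    def handle_prepare(self, xid: int, prepare_lsn: int) -> bool:
        """Prepare the transaction; return True if its changes were skipped."""
        skipped = (xid == self.skip_xid)
        # Even when the changes are skipped we still prepare an (empty)
        # transaction, and skip_xid is left untouched until commit time.
        self.prepared.add(xid)
        self.origin_lsn = prepare_lsn
        return skipped

    def handle_commit_prepared(self, xid: int, commit_lsn: int,
                               crash_between: bool = False) -> None:
        # Transaction 1: clear skip_xid in the catalog if it matches, commit.
        if self.skip_xid == xid:
            self.skip_xid = None

        if crash_between:
            # Origin progress was not advanced, so commit prepared is resent.
            raise SimulatedCrash()

        # Transaction 2: advance origin progress and finish the prepared
        # transaction together.
        self.origin_lsn = commit_lsn
        self.prepared.discard(xid)


def replay_with_crash(xid: int = 740) -> Subscriber:
    """Skip xid, crash after transaction 1, then handle the resent commit."""
    sub = Subscriber(skip_xid=xid)
    sub.handle_prepare(xid, prepare_lsn=50)
    try:
        sub.handle_commit_prepared(xid, commit_lsn=100, crash_between=True)
    except SimulatedCrash:
        pass
    # Resent commit prepared: skip_xid is already cleared, so only the
    # second transaction has any effect.
    sub.handle_commit_prepared(xid, commit_lsn=100)
    return sub
```

In this model the retry after the crash finds skip_xid already cleared and simply commits the empty prepared transaction. Likewise, if skip_xid is changed to another XID between prepare and commit prepared, the first transaction does nothing and the new skip_xid value is left in place.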

Regarding the case where the user specifies the XID of a transaction
after it has been prepared on the subscriber (i.e., the transaction is
not empty), we won’t skip committing the prepared transaction. But I
think that we don't need to support skipping an already-prepared
transaction, since such a transaction doesn't conflict with anything
regardless of whether it has changes or not.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
