Re: Skipping logical replication transactions on subscriber side

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, Alexey Lesovsky <lesovsky(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Skipping logical replication transactions on subscriber side
Date: 2021-12-14 02:49:44
Message-ID: CAA4eK1JQNOSY4rqBKwK-t5BXs8v0=0aVeLsT5qzWxGdcfJCGAQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Dec 13, 2021 at 6:55 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Mon, Dec 13, 2021 at 1:04 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Mon, Dec 13, 2021 at 8:28 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > > >
> > > > 4.
> > > > + * Also, one might think that we can skip preparing the skipped transaction.
> > > > + * But if we do that, PREPARE WAL record won’t be sent to its physical
> > > > + * standbys, resulting in that users won’t be able to find the prepared
> > > > + * transaction entry after a fail-over.
> > > > + *
> > > > ..
> > > > + */
> > > > + if (skipping_changes)
> > > > + stop_skipping_changes(false);
> > > >
> > > > Why do we need such a Prepare's entry either at current subscriber or
> > > > on its physical standby? I think it is to allow Commit-prepared. If
> > > > so, how about if we skip even commit prepared as well? Even on
> > > > physical standby, we would be having the value of skip_xid which can
> > > > help us to skip there as well after failover.
> > >
> > > It's true that skip_xid would be set also on physical standby. When it
> > > comes to preparing the skipped transaction on the current subscriber,
> > > if we want to skip commit-prepared I think we need protocol changes in
> > > order for subscribers to know prepare_lsn and prepare_timestamp so
> > > that it can lookup the prepared transaction when doing
> > > commit-prepared. I proposed this idea before. This change would be
> > > beneficial as of now since the publisher sends even empty transactions.
> > > But considering the proposed patch[1] that makes the publisher not
> > > send empty transactions, this protocol change would be an optimization
> > > only for this feature.
> > >
> >
> > I was thinking of comparing the xid received as part of the
> > commit_prepared message with the value of skip_xid to skip the
> > commit_prepared, but I guess the user could change it between prepare
> > and commit prepared and then we won't be able to detect it, right? I
> > think we can handle this and the streaming case if we disallow users
> > from changing the value of skip_xid while we are already skipping changes,
> > or don't let a new skip_xid take effect in the apply worker while we are
> > already skipping some other transaction. What do you think?
>
> In streaming cases, we don’t know when stream-commit or stream-abort
> comes and another conflict could occur on the subscription in the
> meanwhile. But given that (we expect) this feature is used after the
> apply worker enters into an error loop, this is unlikely to happen in
> practice unless the user sets the wrong XID. Similarly, in 2PC cases,
> we don’t know when commit-prepared or rollback-prepared comes and
> another conflict could occur in the meanwhile. But this could occur in
> practice even if the user specified the correct XID. Therefore, if we
> disallow changing skip_xid until the subscriber receives
> commit-prepared or rollback-prepared, we cannot skip the second
> transaction that conflicts with data on the subscriber.
>

I agree with this theory. Can we reflect this in comments so that in
the future we know why we didn't pursue this direction?
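
To make the scenario concrete, here is a rough toy model of the "disallow
changing skip_xid until commit-prepared arrives" policy. It is only a sketch,
not code from the patch: SkipState, try_set_skip_xid(), on_prepare() and
on_commit_prepared() are hypothetical names invented for illustration, and the
real apply worker is not structured this way. It shows the problem described
above: while the first skipped transaction is still prepared, a request to
skip a second, conflicting XID is refused.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins; the names mirror PostgreSQL's but are defined here. */
typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical per-subscription skip state (illustration only). */
typedef struct SkipState
{
    TransactionId skip_xid;     /* XID the user asked to skip */
    bool          awaiting_2pc; /* skipped xact prepared, end not yet seen */
} SkipState;

/* Policy under discussion: refuse changes while a skip is still in flight. */
static bool
try_set_skip_xid(SkipState *s, TransactionId new_xid)
{
    assert(s != NULL);
    if (s->awaiting_2pc)
        return false;           /* locked until commit/rollback prepared */
    s->skip_xid = new_xid;
    return true;
}

/* PREPARE of the skipped transaction starts the "locked" window. */
static void
on_prepare(SkipState *s, TransactionId xid)
{
    assert(s != NULL);
    if (xid == s->skip_xid)
        s->awaiting_2pc = true;
}

/* COMMIT PREPARED of the skipped transaction ends the window. */
static void
on_commit_prepared(SkipState *s, TransactionId xid)
{
    assert(s != NULL);
    if (s->awaiting_2pc && xid == s->skip_xid)
    {
        s->awaiting_2pc = false;
        s->skip_xid = InvalidTransactionId;
    }
}
```

In this model, once on_prepare() has been processed for XID 100,
try_set_skip_xid(&s, 101) returns false until on_commit_prepared() for 100
arrives, so a conflict raised by 101 in the meantime could not be skipped.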

> From the application perspective, which behavior is preferable between
> skipping preparing a transaction and preparing an empty transaction,
> in the first place? In terms of resource consumption etc., skipping
> preparing transactions seems better. On the other hand, if we skipped
> preparing the transaction, the application would not be able to find
> the prepared transaction after a fail-over to the subscriber.
>

I am not sure how much it matters that such prepares are not present,
because we wanted some way to skip the corresponding commit prepared
as well. I think your previous point is a good enough reason as to why
we should allow such prepares.
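
The behavior we seem to be settling on could be pictured roughly as below.
Again this is a toy model under my own assumptions, not the patch: ApplyState
and the apply_* functions are hypothetical names. The changes of the skipped
transaction are discarded, but PREPARE still creates an (empty) prepared
entry, so a later commit-prepared can find it without consulting skip_xid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified apply-worker state; not the real structures. */
typedef struct ApplyState
{
    bool skipping_changes;  /* discarding changes of the current xact? */
    int  changes_applied;   /* number of changes actually applied */
    bool prepared_exists;   /* is there a prepared-transaction entry? */
} ApplyState;

/* Start of a two-phase transaction; decide whether to skip its changes. */
static void
apply_begin_prepare(ApplyState *st, bool xid_matches_skip_xid)
{
    assert(st != NULL);
    st->skipping_changes = xid_matches_skip_xid;
}

/* A data change: dropped while skipping, applied otherwise. */
static void
apply_change(ApplyState *st)
{
    assert(st != NULL);
    if (st->skipping_changes)
        return;
    st->changes_applied++;
}

/* PREPARE: stop skipping, but still prepare the (possibly empty) xact. */
static void
apply_prepare(ApplyState *st)
{
    assert(st != NULL);
    st->skipping_changes = false;
    st->prepared_exists = true;
}

/* COMMIT PREPARED: succeeds only if the prepared entry can be found. */
static bool
apply_commit_prepared(ApplyState *st)
{
    bool found;

    assert(st != NULL);
    found = st->prepared_exists;
    st->prepared_exists = false;
    return found;
}
```

With apply_begin_prepare(&st, true), any number of apply_change() calls leave
changes_applied at zero, yet apply_commit_prepared() still finds the entry
created by apply_prepare().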

--
With Regards,
Amit Kapila.
