Re: Skipping logical replication transactions on subscriber side

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, Alexey Lesovsky <lesovsky(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Skipping logical replication transactions on subscriber side
Date: 2022-01-26 02:28:01
Message-ID: CAA4eK1LANoQHgP4Rfe04nf1RWNhwckJ-+Osp+hmm7JuYZ+Wufw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jan 25, 2022 at 8:39 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Tue, Jan 25, 2022 at 11:58 PM David G. Johnston
> <david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
> >
> > On Tue, Jan 25, 2022 at 7:47 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >>
> >> Yeah, I think it's a good idea to clear the subskipxid after the first
> >> transaction regardless of whether the worker skipped it.
> >>
> >
> > So basically, instead of stopping the worker with an error, you suggest having the worker continue applying changes (after resetting subskipxid and, arguably, the ?_error_* fields), and logging the transaction XID mismatch as a warning in the log file rather than as an error.
>
> Agreed, I think it's better to log a warning than to raise an error.
> In the case where the user specified the wrong XID, the worker should
> fail again due to the same error.
>

IIUC, the proposal is to compare skip_xid with the XID of the very
first transaction the apply worker receives and, if they don't match,
to raise a warning and then continue. This seems like a reasonable
idea, but can we guarantee that the transaction we want to skip is
always the first one received? We only seem to guarantee that the
server won't send something again once it has been written
durably/flushed on the subscriber side. So I guess it can happen that,
before the errored transaction, the worker receives some empty xact,
or part of the stream of some xact (consider streaming transactions),
or there could be other cases as well where the server sends those
xacts again.

Now, if the above reasoning is correct, then your proposal to clear
skip_xid in the catalog as soon as we have successfully applied the
first transaction seems reasonable to me.
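
To make the behaviour under discussion concrete, here is a minimal toy
model in Python. It is only a sketch and not the patch: Xact,
Subscription and apply_stream are hypothetical names, skip_xid merely
stands in for the subskipxid catalog field, and "fails" is an assumed
flag marking a transaction whose apply raises an error.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Xact:
    xid: int
    fails: bool = False  # assumption: True means applying this xact raises an error


@dataclass
class Subscription:
    skip_xid: Optional[int] = None  # stand-in for the subskipxid catalog field
    log: List[str] = field(default_factory=list)


def apply_stream(sub: Subscription, xacts: List[Xact]) -> Optional[int]:
    """Apply xacts in order; return the xid that stopped the worker, or None."""
    for x in xacts:
        # The transaction the user asked to skip: skip it and clear skip_xid.
        if sub.skip_xid is not None and x.xid == sub.skip_xid:
            sub.log.append(f"skipped {x.xid}")
            sub.skip_xid = None
            continue
        # An apply error stops the worker; skip_xid is left untouched.
        if x.fails:
            sub.log.append(f"error {x.xid}")
            return x.xid
        sub.log.append(f"applied {x.xid}")
        # First successfully applied transaction did not match skip_xid:
        # log a warning instead of raising an error, then clear skip_xid.
        if sub.skip_xid is not None:
            sub.log.append(f"warning: skip_xid {sub.skip_xid} != {x.xid}, cleared")
            sub.skip_xid = None
    return None
```

In this model, setting skip_xid to the failing XID lets the worker get
past it, and setting a wrong XID leaves the worker failing on the same
error. If a resent transaction (for example an empty xact) arrives
ahead of the errored one, the toy worker applies it, logs the warning,
clears skip_xid, and then stops on the errored transaction again. That
last case is the ordering question raised above; the real worker may
of course treat it differently.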

--
With Regards,
Amit Kapila.
