Re: Skipping logical replication transactions on subscriber side

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, Alexey Lesovsky <lesovsky(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Skipping logical replication transactions on subscriber side
Date: 2022-01-26 07:21:00
Message-ID: CAD21AoBf0yBo3Dbnm2Qjp4vk=KfVaH=7h00Doyor08nRH9yMBA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 26, 2022 at 1:43 PM David G. Johnston
<david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>
> On Tue, Jan 25, 2022 at 9:16 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>
>> On Wed, Jan 26, 2022 at 9:36 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> > On Wed, Jan 26, 2022 at 12:54 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>
>> > > >
>> > > > Probably, we also need to consider the case where the tablesync worker
>> > > > entered an error loop and the user wants to skip the transaction? The
>> > > > apply worker is also running at the same time but it should not clear
>> > > > subskipxid. Similarly, the tablesync worker should not clear
>> > > > subskipxid if the apply worker wants to skip the transaction.
>> > > >
>> > >
>> > > I think for tablesync workers, the skip_xid set via this mechanism
>> > > won't work as we don't have any remote_xid for them, and neither any
>> > > XID is reported in the view for them.
>> >
>> > If the tablesync worker raises an error while applying changes after
>> > finishing the copy, it also reports the error XID.
>> >
>>
>> Right and agreed with your assessment for the same.
>>
>
> IIUC each tablesync process also performs an apply stage but only applies the messages related to the single table it is responsible for. Once all tablesync workers synchronize they are all destroyed and the main apply worker takes over and applies transactions to all subscribed tables.
>
> We probably should just provide an option for the user to specify "subrelid". If null, only the main apply worker will skip the given xid, otherwise only the worker tasked with syncing that particular table will do so. It might take a sequence of ALTER SUBSCRIPTION SET commands to get a broken initial table synchronization to load completely but at least there will not be any surprises as to which tables had transactions skipped and which did not.

That would work, but I'm concerned about whether users can specify it
properly. Also, we would need to change the errcontext message
generated by apply_error_callback() so that the user can tell whether
the error occurred in the apply worker or in a tablesync worker.
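
To make the proposed rule concrete, here is a minimal sketch of how a
worker might decide whether a skip request targets it. This is only an
illustration of the "subrelid" idea quoted above, not code from any
patch: SkipRequest, Worker, and should_skip are hypothetical names, and
the model assumes a tablesync worker is identified by the OID of the
table it syncs while the main apply worker has none.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical model of the proposal; none of these names exist in PostgreSQL.
@dataclass(frozen=True)
class SkipRequest:
    skip_xid: int            # remote XID the user asked to skip
    subrelid: Optional[int]  # None -> apply worker; OID -> that table's tablesync worker


@dataclass(frozen=True)
class Worker:
    relid: Optional[int]     # None for the main apply worker, table OID for tablesync


def should_skip(req: SkipRequest, worker: Worker, remote_xid: int) -> bool:
    """Return True only if this worker is the one the request targets."""
    if remote_xid != req.skip_xid:
        return False
    # A NULL subrelid matches only the apply worker; a non-NULL subrelid
    # matches only the tablesync worker responsible for that table.
    return worker.relid == req.subrelid
```

Under this rule a request with subrelid left NULL is ignored by every
tablesync worker, and a request naming a table is ignored by the apply
worker, so neither would skip (or clear) a setting meant for the other.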

Or, as another idea, since an error during table synchronization is
not common and can in practice be resolved by truncating the table and
restarting the synchronization, there might be no need to go this far,
and we could support it only for apply worker errors.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
