Re: Skipping logical replication transactions on subscriber side

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Alexey Lesovsky <lesovsky(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Skipping logical replication transactions on subscriber side
Date: 2021-09-06 05:49:49
Message-ID: CAD21AoABWLHp3FCLz77Wgopxu-6BZs61qSoJKUw9PTTtDbJ=ng@mail.gmail.com
Lists: pgsql-hackers

On Sat, Sep 4, 2021 at 12:24 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Sep 3, 2021 at 2:15 AM Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> wrote:
> >
> > > On Aug 30, 2021, at 12:06 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > I've attached rebased patches.
> > For the v12-0003 patch:
> >
> > I believe this feature is needed, but it also seems like a very powerful foot-gun. Can we do anything to make it less likely that users will hurt themselves with this tool?
> >
>
> This won't do any more harm than what users can currently do via
> pg_replication_slot_advance, which is documented as well; see [1].
> This will be allowed only for superusers. Its effect will be
> documented with a precautionary note to use it only when the other
> available ways can't be used.

Right.
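
For reference, the existing workaround mentioned above looks roughly
like the following. This is only a sketch: the subscription name mysub,
the slot name (assumed here to be the default, i.e. the same as the
subscription name) and the LSN are placeholders, and the target LSN has
to be worked out by hand for the failing transaction.

```sql
-- On the subscriber: stop the apply worker so that the slot on the
-- publisher is no longer in use.
ALTER SUBSCRIPTION mysub DISABLE;

-- On the publisher: move the slot forward past the failing transaction.
-- '0/3000000' is a placeholder LSN, not a real value.
SELECT pg_replication_slot_advance('mysub', '0/3000000');

-- On the subscriber: resume applying changes.
ALTER SUBSCRIPTION mysub ENABLE;
```

Note that everything between the old slot position and the given LSN is
skipped, not just the one failing transaction, so this is at least as
much of a foot-gun as skipping by xid.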

>
> > I am thinking back to support calls I have attended. When a production system is down, there is often some hesitancy to perform ad-hoc operations on the database, but once the decision has been made to do so, people try to get the whole process done as quickly as possible. If multiple transactions on the publisher fail on the subscriber, they will do so in series, not in parallel.
> >
>
> The subscriber will know only one transaction failure at a time, till
> that is resolved, the apply won't move ahead and it won't know even if
> there are other transactions that are going to fail in the future.
>
> >
> > If the user could instead clear all failed transactions of the same type, that might make it less likely that they unthinkingly also skip subsequent errors of some different type. Perhaps something like ALTER SUBSCRIPTION ... SET (skip_failures = 'duplicate key value violates unique constraint "test_pkey"')?
> >
>
> I think that, if we want, we can allow skipping a particular error via
> skip_error_code instead of via the error message, but I'm not sure it
> would be better to skip a particular operation of the transaction
> rather than the entire transaction. Normally, for atomicity, a
> transaction is either committed or rolled back, never partially done,
> so I think it would be preferable to skip the entire transaction
> rather than skipping it partially.

I think the suggestion by Mark is to skip the entire transaction if
the kind of error matches the specified error.

I think my proposed feature is meant as a tool for situations where
something that should not have happened has happened, rather than as a
conflict resolution mechanism. If users fall into a difficult situation
where they need to skip a lot of transactions with this skip_xid
feature, they should rebuild the logical replication from scratch.
Skipping all transactions that fail due to the same type of failure
seems problematic to me, for example if the user forgets to reset the
setting. If we want to skip only the particular operation that failed
due to the specified error, we should have a proper conflict resolution
feature that can handle various types of conflicts with various
resolution methods, as other RDBMSs support.

>
> > This is arguably a different feature request, and not something your patch is required to address, but I wonder how much we should limit people shooting themselves in the foot? If we built something like this using your skip_xid feature, rather than instead of your skip_xid feature, would your feature need to be modified?
> >
>
> Sawada-San can answer better but I don't see any problem building any
> such feature on top of what is currently proposed.

If the feature you proposed is to skip the entire transaction, I also
don't see any problem building it on top of my patch. The patch adds
the mechanism to skip the entire transaction, so all we would need to
do for that feature is extend how the skipping behavior is triggered.

>
> >
> > I'm having trouble thinking of an example conflict where skipping a transaction would be better than writing a BEFORE INSERT trigger on the conflicting table which suppresses or redirects conflicting rows somewhere else. Particularly for larger transactions containing multiple statements, suppressing the conflicting rows using a trigger would be less messy than skipping the transaction. I think your patch adds a useful tool to the toolkit, but maybe we should mention more alternatives in the docs? Something like, "changing the data on the subscriber so that it doesn't conflict with incoming changes, or dropping the conflicting constraint or unique index, or writing a trigger on the subscriber to suppress or redirect conflicting incoming changes, or as a last resort, by skipping the whole transaction"?
> >
>
> +1 for extending the docs as per this suggestion.

Agreed. I'll add such a description to the docs.
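
As a rough illustration of the trigger-based alternative quoted above,
something like the following could be used on the subscriber. This is a
sketch under assumptions, not something from the patch: the table name
test is inferred from the "test_pkey" example earlier in the thread,
while the key column id, the side table test_conflicts, and the
function and trigger names are all hypothetical.

```sql
-- Hypothetical side table to hold redirected rows. LIKE copies the
-- column definitions but not the primary key or unique indexes.
CREATE TABLE test_conflicts (LIKE test);

-- Redirect a row that would violate the primary key instead of letting
-- the insert fail. The column "id" is an assumption.
CREATE FUNCTION redirect_conflicting_row() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    IF EXISTS (SELECT 1 FROM test WHERE id = NEW.id) THEN
        INSERT INTO test_conflicts VALUES (NEW.*);
        RETURN NULL;  -- suppress the conflicting insert
    END IF;
    RETURN NEW;       -- no conflict, let the insert proceed
END;
$$;

CREATE TRIGGER test_redirect_conflicts
    BEFORE INSERT ON test
    FOR EACH ROW EXECUTE FUNCTION redirect_conflicting_row();

-- The apply worker runs with session_replication_role = replica, so
-- the trigger fires for replicated changes only if it is enabled as a
-- REPLICA (or ALWAYS) trigger.
ALTER TABLE test ENABLE REPLICA TRIGGER test_redirect_conflicts;
```

With ENABLE REPLICA the trigger fires only for changes applied by
replication, not for local inserts; ENABLE ALWAYS would cover both. The
rest of the incoming transaction is still applied, which is the main
difference from skipping the whole transaction.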

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
