Re: Skipping logical replication transactions on subscriber side

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, Alexey Lesovsky <lesovsky(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Skipping logical replication transactions on subscriber side
Date: 2022-01-14 12:05:25
Message-ID: CALDaNm2VCHuL2LzSeUeCcCmuzJxwrPjXPaEE6Sr989UAsrTSYg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 14, 2022 at 7:49 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Wed, Jan 12, 2022 at 11:10 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> >
> > On Wed, Jan 12, 2022 at 11:32 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Jan 12, 2022 at 12:21 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > > On Wed, Jan 12, 2022 at 5:49 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > >
> > > > > On Tue, Jan 11, 2022 at 7:08 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > > >
> > > > > > On Tue, Jan 11, 2022 at 1:51 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > > > >
> > > > > > > On second thought, the same is true for other cases, for example,
> > > > > > > preparing the transaction and clearing skip_xid while handling a
> > > > > > > prepare message. That is, currently we don't clear skip_xid while
> > > > > > > handling a prepare message but do that while handling commit/rollback
> > > > > > > prepared message, in order to avoid the worst case. If we do both
> > > > > > > while handling a prepare message and the server crashes between them,
> > > > > > > it ends up that skip_xid is cleared and the transaction will be
> > > > > > > resent, which is identical to the worst-case above.
> > > > > > >
> > > > > >
> > > > > > How are you thinking to update the skip xid before prepare? If we do
> > > > > > it in the same transaction then the changes in the catalog will be
> > > > > > part of the prepared xact but won't be committed. Now, say if we do it
> > > > > > after prepare, then the situation won't be the same because after
> > > > > > restart the same xact won't appear again.
> > > > >
> > > > > I was thinking to commit the catalog change first in a separate
> > > > > transaction while not updating origin LSN and then prepare an empty
> > > > > transaction while updating origin LSN.
> > > > >
> > > >
> > > > But, won't it complicate the handling if in the future we try to
> > > > enhance this API such that it skips partial changes like skipping only
> > > > for particular relation(s) or particular operations as discussed
> > > > previously in this thread?
> > >
> > > Right. I was thinking that if we accept the situation that the user
> > > has to set skip_xid again in case of the server crashes, we might be
> > > able to accept also the situation that the user has to clear skip_xid
> > > in a case of the server crashes. But it seems the former is less
> > > problematic.
> > >
> > > I've attached an updated patch that incorporated all comments I got so far.
> >
> > Thanks for the updated patch, few comments:
>
> Thank you for the comments!
>
> > 1) Currently skip xid is not displayed in describe subscriptions, can
> > we include it too:
> > \dRs+ sub1
> > List of subscriptions
> > Name | Owner | Enabled | Publication | Binary | Streaming | Two
> > phase commit | Synchronous commit | Conninfo
> > ------+---------+---------+-------------+--------+-----------+------------------+--------------------+--------------------------------
> > sub1 | vignesh | t | {pub1} | f | f | e
> > | off | dbname=postgres host=localhost
> > (1 row)
> >
> > 2) This import "use PostgreSQL::Test::Utils;" is not required:
> > +# Tests for skipping logical replication transactions.
> > +use strict;
> > +use warnings;
> > +use PostgreSQL::Test::Cluster;
> > +use PostgreSQL::Test::Utils;
> > +use Test::More tests => 6;
> >
> > 3) Some of the comments uses a punctuation mark and some of them does
> > not use, Should we keep it consistent:
> > + # Wait for worker error
> > + $node_subscriber->poll_query_until(
> > + 'postgres',
> >
> > + # Set skip xid
> > + $node_subscriber->safe_psql(
> > + 'postgres',
> >
> > +# Create publisher node.
> > +my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
> > +$node_publisher->init(allows_streaming => 'logical');
> >
> >
> > +# Create subscriber node.
> > +my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
> >
> > 4) Should this be changed:
> > + * True if we are skipping all data modification changes (INSERT,
> > UPDATE, etc.) of
> > + * the specified transaction at MySubscription->skipxid. Once we
> > start skipping
> > + * changes, we don't stop it until the we skip all changes of the
> > transaction even
> > + * if pg_subscription is updated that and MySubscription->skipxid
> > gets changed or
> > to:
> > + * True if we are skipping all data modification changes (INSERT,
> > UPDATE, etc.) of
> > + * the specified transaction at MySubscription->skipxid. Once we
> > start skipping
> > + * changes, we don't stop it until we skip all changes of the transaction even
> > + * if pg_subscription is updated that and MySubscription->skipxid
> > gets changed or
> >
> > In "stop it until the we skip all changes", here the is not required.
> >
>
> I agree with all the comments above. I've attached an updated patch.

Thanks for the updated patch, few minor comments:
1) Should "SKIP" be "SKIP (" here:
@@ -1675,7 +1675,7 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> */
else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
- "RENAME TO", "REFRESH
PUBLICATION", "SET",
+ "RENAME TO", "REFRESH
PUBLICATION", "SET", "SKIP",

2) We could add a test for this if possible:
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ if (!superuser())
+ ereport(ERROR,
+
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must
be superuser to skip transaction")));

3) There was one typo in commit message, transaciton shoudl be transaction:
After skipping the transaciton the apply worker clears
pg_subscription.subskipxid.

Another small typo, susbscriber should be subscriber:
+ prepared by enabling <literal>two_phase</literal> on susbscriber. After
+ the logical replication successfully skips the transaction, the
transaction

4) Should skipsubxid be mentioned as subskipxid here:
+ * Clear the subskipxid of pg_subscription catalog. This catalog
+ * update must be committed before finishing prepared transaction.
+ * Because otherwise, in a case where the server crashes between
+ * finishing prepared transaction and the catalog update, COMMIT
+ * PREPARED won’t be resent but skipsubxid is left.

Regards,
Vignesh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2022-01-14 12:14:02 Re: Parallel Inserts (WAS: [bug?] Missed parallel safety checks..)
Previous Message Amit Kapila 2022-01-14 12:02:57 Re: Logical replication timeout problem