Re: Skipping logical replication transactions on subscriber side

From: Greg Nancarrow <gregn4422(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Alexey Lesovsky <lesovsky(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Skipping logical replication transactions on subscriber side
Date: 2021-09-02 12:03:36
Message-ID: CAJcOf-eZF5gqOSWqSUpuivXYvA0xMV=ixXeKYOXHwrRfz4==2w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> I've attached rebased patches. 0004 patch is not the scope of this
> patch. It's borrowed from another thread[1] to fix the assertion
> failure for newly added tests. Please review them.
>

Some initial comments for the v12-0003 patch:

(1) Patch comment
"This commit introduces another way to skip the transaction in question."

I think it should further explain: "This commit introduces another way
to skip the transaction in question, other than manually updating the
subscriber's database or using pg_replication_origin_advance()."

doc/src/sgml/logical-replication.sgml
(2)

Suggested minor update:

BEFORE:
+ transaction that conflicts with the existing data. When a conflict produce
+ an error, it is shown in
<structname>pg_stat_subscription_errors</structname>
+ view as follows:
AFTER:
+ transaction that conflicts with the existing data. When a conflict produces
+ an error, it is recorded in the
<structname>pg_stat_subscription_errors</structname>
+ view as follows:

(3)
+ found from those outputs (transaction ID 740 in the above case).
The transaction

Shouldn't it be transaction ID 716?

(4)
+ can be skipped by setting <replaceable>skip_xid</replaceable> to
the subscription

Is it better to say here ... "on the subscription" ?

(5)
Just skipping a transaction could make a subscriber inconsistent, right?

Would it be better as follows?

BEFORE:
+ In either way, those should be used as a last resort. They skip the whole
+ transaction including changes that may not violate any constraint and easily
+ make subscriber inconsistent if a user specifies the wrong transaction ID or
+ the position of origin.

AFTER:
+ Either way, those transaction skipping methods should be used as a
last resort.
+ They skip the whole transaction, including changes that may not violate any
+ constraint. They may easily make the subscriber inconsistent,
especially if a
+ user specifies the wrong transaction ID or the position of origin.

(6)
The grammar is not great in the following description, so here's a
suggested improvement:

BEFORE:
+ incoming change or by skipping the whole transaction. This option
+ specifies transaction ID that logical replication worker skips to
+ apply. The logical replication worker skips all data modification

AFTER:
+ incoming changes or by skipping the whole transaction. This option
+ specifies the ID of the transaction whose application is to
be skipped
+ by the logical replication worker. The logical replication worker
+ skips all data modification

src/backend/postmaster/pgstat.c
(7)
BEFORE:
+ * Tell the collector about clear the error of subscription.
AFTER:
+ * Tell the collector to clear the subscription error.

src/backend/replication/logical/worker.c
(8)
+ * subscription is invalidated and* MySubscription->skipxid gets
changed or reset.

There is a "*" after "and".

(9)
Do these lines really need to be moved up?

+ /* extract XID of the top-level transaction */
+ stream_xid = logicalrep_read_stream_start(s, &first_segment);
+

src/backend/postmaster/pgstat.c
(10)

+ bool m_clear; /* clear all fields except for last_failure and
+ * last_errmsg */

I think it should say: clear all fields except for last_failure,
last_errmsg and stat_reset_timestamp.

Regards,
Greg Nancarrow
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2021-09-02 12:36:34 Re: Allow escape in application_name (was: [postgres_fdw] add local pid to fallback_application_name)
Previous Message houzj.fnst@fujitsu.com 2021-09-02 11:54:15 Fix typo in comments