Re: Failed transaction statistics to measure the logical replication progress

From: vignesh C <vignesh21(at)gmail(dot)com>
To: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Failed transaction statistics to measure the logical replication progress
Date: 2021-12-03 06:11:32
Message-ID: CALDaNm32HHjdwmuoF+Nw5CU70r819Kq+Tmat3bzkkvSv_2u=gA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Dec 1, 2021 at 3:04 PM osumi(dot)takamichi(at)fujitsu(dot)com
<osumi(dot)takamichi(at)fujitsu(dot)com> wrote:
>
> On Friday, November 19, 2021 11:11 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > Besides that, I’m not sure how useful commit_bytes, abort_bytes, and
> > error_bytes are. I originally thought these statistics track the size of received
> > data, i.e., how much data is transferred from the publisher and processed on
> > the subscriber. But what the view currently has is how much memory is used in
> > the subscription worker. The subscription worker emulates
> > ReorderBufferChangeSize() on the subscriber side but, as the comment of
> > update_apply_change_size() mentions, the size in the view is not accurate:
> ...
> > I guess that the purpose of these values is to compare them to total_bytes,
> > stream_bytes, and spill_bytes but if the calculation is not accurate, does it mean
> > that the more stats are updated, the more the stats will be getting inaccurate?
> Thanks for your comment!
>
> I tried to solve your concerns about the byte columns, but there are really difficult issues to solve.
> For example, to begin with, the messages of the apply worker are different from those of
> the reorder buffer.
>
> Therefore, I decided to split the previous patch and make counter columns go first.
> v14 was checked by pgperltidy and pgindent.
>
> This patch can be applied to the PG whose commit id is after 8d74fc9 (introduction of
> pg_stat_subscription_workers).

Thanks for the updated patch.
Currently we are storing the commit_count, error_count and abort_count
for each table of the table sync operation. If we have thousands of
tables, we will be storing this information for each of the tables.
Shouldn't we be storing the consolidated information in this case?

diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..02e9486 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -1149,6 +1149,11 @@ copy_table_done:
     MyLogicalRepWorker->relstate_lsn = *origin_startpos;
     SpinLockRelease(&MyLogicalRepWorker->relmutex);

+    /* Report the success of table sync. */
+    pgstat_report_subworker_xact_end(MyLogicalRepWorker->subid,
+                                     MyLogicalRepWorker->relid,
+                                     0 /* no logical message type */ );

postgres=# select * from pg_stat_subscription_workers ;
 subid | subname | subrelid | commit_count | error_count | abort_count | last_error_relid | last_error_command | last_error_xid | last_error_count | last_error_message | last_error_time
-------+---------+----------+--------------+-------------+-------------+------------------+--------------------+----------------+------------------+--------------------+-----------------
 16411 | sub1    |    16387 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16396 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16390 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16393 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16402 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16408 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16384 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16399 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16405 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
(9 rows)
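
One way to picture the consolidation asked about above is the sketch below. It is only an illustration and not part of the patch: the struct and function names (TableSyncStats, SubSyncTotals, consolidate_tablesync_stats) are hypothetical, and the code assumes the per-table counters are simply summed into a single entry per subscription rather than kept as one row per table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-table row, mirroring the counter columns in the output above. */
typedef struct TableSyncStats
{
    uint32_t subid;        /* subscription OID */
    uint32_t subrelid;     /* OID of the table being synchronized */
    int64_t  commit_count;
    int64_t  error_count;
    int64_t  abort_count;
} TableSyncStats;

/* Hypothetical consolidated entry: one per subscription instead of one per table. */
typedef struct SubSyncTotals
{
    uint32_t subid;
    int64_t  ntables;      /* number of per-table rows folded in */
    int64_t  commit_count;
    int64_t  error_count;
    int64_t  abort_count;
} SubSyncTotals;

/*
 * Sum the counters of every row that belongs to the given subscription.
 * Rows of other subscriptions are skipped.
 */
SubSyncTotals
consolidate_tablesync_stats(uint32_t subid, const TableSyncStats *rows, size_t nrows)
{
    SubSyncTotals totals = {subid, 0, 0, 0, 0};

    assert(rows != NULL || nrows == 0);

    for (size_t i = 0; i < nrows; i++)
    {
        if (rows[i].subid != subid)
            continue;

        totals.ntables++;
        totals.commit_count += rows[i].commit_count;
        totals.error_count += rows[i].error_count;
        totals.abort_count += rows[i].abort_count;
    }

    return totals;
}
```

With the nine rows shown above, this would yield a single entry for subscription 16411 with commit_count 9 and zero errors and aborts, so the amount of stored data would no longer grow with the number of tables.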

Regards,
Vignesh
