RE: Failed transaction statistics to measure the logical replication progress

From: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>
To: "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, 'Greg Nancarrow' <gregn4422(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "vignesh21(at)gmail(dot)com" <vignesh21(at)gmail(dot)com>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, "sawada(dot)mshk(at)gmail(dot)com" <sawada(dot)mshk(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Failed transaction statistics to measure the logical replication progress
Date: 2022-01-04 11:52:21
Message-ID: TYCPR01MB83732A6796425B20CDC76987ED4A9@TYCPR01MB8373.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Friday, December 31, 2021 10:12 AM Tang, Haiying/唐 海英 <tanghy(dot)fnst(at)fujitsu(dot)com> wrote:
> On Wednesday, December 22, 2021 6:14 PM osumi(dot)takamichi(at)fujitsu(dot)com
> <osumi(dot)takamichi(at)fujitsu(dot)com> wrote:
> >
> > Attached the new patch v19.
> >
>
> Thanks for your patch. I think it's better if you could add this patch to the
> commitfest.
> Here are some comments:
Thank you for your review !
I've created one entry in the next commitfest for this patch [1]

>
> 1)
> + <structfield>commit_count</structfield> <type>bigint</type>
> + </para>
> + <para>
> + Number of transactions successfully applied in this subscription.
> + Both COMMIT and COMMIT PREPARED increment this counter.
> + </para></entry>
> + </row>
> ...
>
> I think the commands (like COMMIT, COMMIT PREPARED ...) can be
> surrounded with "<command> </command>", thoughts?
Makes sense to me. Fixed.

Note that to the user perspective,
we should write only COMMIT and COMMIT PREPARED in the documentation.
Thus, I don't list up other commands.

I wrapped ROLLBACK PREPARED for abort_count as well.

> 2)
> +extern void pgstat_report_subworker_xact_end(LogicalRepWorker
> *repWorker,
> +
> LogicalRepMsgType command,
> +
> bool bforce);
>
> Should "bforce" be "force"?
Fixed the typo.

> 3)
> + * This should be called before the call of process_syning_tables() so
> + to not
>
> "process_syning_tables()" should be "process_syncing_tables()".
Fixed.

> 4)
> +void
> +pgstat_send_subworker_xact_stats(LogicalRepWorker *repWorker, bool
> +force) {
> + static TimestampTz last_report = 0;
> + PgStat_MsgSubWorkerXactEnd msg;
> +
> + if (!force)
> + {
> ...
> + if (!TimestampDifferenceExceeds(last_report, now,
> PGSTAT_STAT_INTERVAL))
> + return;
> + last_report = now;
> + }
> +
> ...
> + if (repWorker->commit_count == 0 && repWorker->abort_count ==
> 0)
> + return;
> ...
>
> I think it's better to check commit_count and abort_count first, then check if
> reach PGSTAT_STAT_INTERVAL.
> Otherwise if commit_count and abort_count are 0, it is possible that the value
> of last_report has been updated but it didn't send stats in fact. In this case,
> last_report is not the real time that send last message.
Yeah, agreed. This fix is right in terms of the variable name aspect.

The only scenario that we can take advantage of the previous implementation of
v19's pgstat_send_subworker_xact_stats() should be a case where
we execute a bunch of commit-like logical replication apply messages
within PGSTAT_STAT_INTERVAL intensively and continuously for long period,
because we check "repWorker->commit_count == 0 && repWorker->abort_count == 0"
just once before calling pgstat_send() in this case.
*But*, this scenario didn't look reasonable. In other words,
the way to call TimestampDifferenceExceeds() only if there's any need of
update for commit_count/abort_count looks less heavy.
Accordingly, I've fixed it as you suggested.
Also, I modified some comments in pgstat_send_subworker_xact_stats() for this change.

Kindly have a look at the v20 shared in [2].

[1] - https://commitfest.postgresql.org/37/3504/
[2] - https://www.postgresql.org/message-id/TYCPR01MB8373AB2AE1A6EC7B9E012519ED4A9%40TYCPR01MB8373.jpnprd01.prod.outlook.com

Best Regards,
Takamichi Osumi

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2022-01-04 12:01:34 Re: pg_dump/restore --no-tableam
Previous Message Simon Riggs 2022-01-04 11:49:28 Re: Documenting when to retry on serialization failure