RE: Failed transaction statistics to measure the logical replication progress

From: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>
To: 'vignesh C' <vignesh21(at)gmail(dot)com>
Cc: Greg Nancarrow <gregn4422(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Failed transaction statistics to measure the logical replication progress
Date: 2021-11-09 11:39:32
Message-ID: TYCPR01MB8373E39FF56C6F32A122CB94ED929@TYCPR01MB8373.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Monday, November 8, 2021 3:12 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> On Fri, Nov 5, 2021 at 1:42 PM osumi(dot)takamichi(at)fujitsu(dot)com
> <osumi(dot)takamichi(at)fujitsu(dot)com> wrote:
> > Lastly, I removed one unnecessary test that checked publisher's stats
> > in the TAP tests.
> > Also I introduced ApplyTxnExtraData structure to remove void* argument
> > of update_apply_change_size that might worsen the readability of codes
> > in the previous version.
>
> Thanks for the updated patch.
Thanks you for checking my patch !

> Few comments:
> 1) You could remove LogicalRepPreparedTxnData,
> LogicalRepCommitPreparedTxnData & LogicalRepRollbackPreparedTxnData
> and change it to char *gid to reduce the function parameter and simiplify the
> assignment:
> + */
> +void
> +pgstat_report_subworker_twophase_xact(Oid subid, LogicalRepMsgType
> +command,
> +
> PgStat_Counter xact_size,
> +
> LogicalRepPreparedTxnData *prepare_data,
> +
> LogicalRepCommitPreparedTxnData *commit_data,
> +
> LogicalRepRollbackPreparedTxnData *rollback_data)
Fixed.


> 2) Shouldn't this change be part of skip xid patch?
> - TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
> + TupleDescInitEntry(tupdesc, (AttrNumber) 10,
> + "last_error_command",
> TEXTOID, -1, 0);
> - TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
> + TupleDescInitEntry(tupdesc, (AttrNumber) 11, "last_error_xid",
> XIDOID, -1, 0);
> - TupleDescInitEntry(tupdesc, (AttrNumber) 6, "error_count",
> + TupleDescInitEntry(tupdesc, (AttrNumber) 12, "last_error_count",
> INT8OID, -1, 0);
> - TupleDescInitEntry(tupdesc, (AttrNumber) 7, "error_message",
> + TupleDescInitEntry(tupdesc, (AttrNumber) 13,
> + "last_error_message",
> TEXTOID, -1, 0);
> - TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_error_time",
> + TupleDescInitEntry(tupdesc, (AttrNumber) 14, "last_error_time",
Hmm, I didn't think so. Those renames are necessary
to make exisiting columns of skip xid separate from newly-introduced xact stats.
That means, original names of skip xid columns in v20 by itself are fine
and the renames are needed only when this patch gets committed.
At present, we cannot guarantee that this patch will be committed
so I'd like to take care of those renames.


> 3) This newly added structures should be added to typedefs.list:
> ApplyTxnExtraData
> XactSizeEntry
> PgStat_MsgSubWorkerXactEnd
> PgStat_MsgSubWorkerTwophaseXact
> PgStat_StatSubWorkerPreparedXact
> PgStat_StatSubWorkerPreparedXactSize
Added.

> 4) We are not sending the transaction size in case of table sync, is this
> intentional, if so I felt we should document this in
> pg_stat_subscription_workers
> + /* Report the success of table sync. */
> + pgstat_report_subworker_xact_end(MyLogicalRepWorker->subid,
> +
> MyLogicalRepWorker->relid,
> +
> 0 /* no logical message type */,
> +
> 0 /* xact size */);
Right. Updated the doc description.
I added the description in a way that the bytes
stats are only for apply worker.


> 5) pg_stat_subscription_workers has a lot of columns, if we can reduce the
> column size the readability will improve, like xact_commit_count to
> commit_count, xact_commit_bytes to commit_bytes, etc
> + w.xact_commit_count,
> + w.xact_commit_bytes,
> + w.xact_error_count,
> + w.xact_error_bytes,
> + w.xact_abort_count,
> + w.xact_abort_bytes,
It makes sense. Those can be somehow redundant.

Tentatively, I renamed only columns' names exported to the users.
This is because changing internal data structure as well (e.g. removing
the PgStat_StatSubWorkerEntry's prefixes) causes duplication name
of 'error_count' members and changing such an internal data structure
of skip xid part will have a huge impact of other parts.
Kindly imagine a case that we add 'last_' prefix to the
all error statistics representing an error of the structure.
If you aren't satisfied with this change, please let me know.

Best Regards,
Takamichi Osumi

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message osumi.takamichi@fujitsu.com 2021-11-09 11:55:33 RE: Failed transaction statistics to measure the logical replication progress
Previous Message osumi.takamichi@fujitsu.com 2021-11-09 11:35:24 RE: Failed transaction statistics to measure the logical replication progress