| From: | Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> |
|---|---|
| To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
| Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, michael(at)paquier(dot)xyz, bertranddrouvot(dot)pg(at)gmail(dot)com, andres(at)anarazel(dot)de, shveta(dot)malik(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Report bytes and transactions actually sent downtream |
| Date: | 2026-06-30 07:01:06 |
| Message-ID: | CAE9k0PnBLCA0ZrzanQVyQCRnfk3rRZqctejCC3Cp9ZnhjMiyvQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Thank you, Masahiko-san, for your review comments. Please find my
responses inline below:
On Tue, Jun 30, 2026 at 4:32 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Mon, Jun 29, 2026 at 2:08 AM Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
> >
> Thank you for updating the patch! The new column name output_bytes
> looks good to me.
>
> Here are some comments:
>
> memset(nulls, 0, sizeof(nulls));
> values[0] = LSNGetDatum(lsn);
> + outputBytes += sizeof(XLogRecPtr);
> values[1] = TransactionIdGetDatum(xid);
> + outputBytes += sizeof(TransactionId);
> :
> :
> /* ick, but cstring_to_text_with_len works for bytea perfectly
> fine */ values[2] =
> PointerGetDatum(cstring_to_text_with_len(ctx->out->data,
> ctx->out->len));
> + outputBytes += ctx->out->len;
>
> I'm not sure that we should include these values to the new statistics
> since they are not the data produced by output plugins. I think it's
> better to collect the amount of data output plugins produced and get
> it in OutputPluginWrite(). What do you think?
>
Agreed. The values added by the SQL interface, such as the LSN and XID
columns, are not produced by the output plugin, so they should not be
counted as output bytes.
I also agree that output_bytes should be incremented in
OutputPluginWrite(), since that is the common path for both the SQL
and walsender consumers and ctx->out contains the data produced by the
output plugin there.
I have implemented these changes in the attached patch, please have a
look and let me know your thoughts.
> ---
> - elog(DEBUG2, "UpdateDecodingStats: updating stats %p %" PRId64 "
> %" PRId64 " %" PRId64 " %" PRId64 " %" PRId64 " %" PRId64 " %" PRId64
> " %" PRId64 " %" PRId64,
> + elog(DEBUG2, "UpdateDecodingStats: updating stats %p %" PRId64 "
> %" PRId64 " %" PRId64 " %" PRId64 " %" PRId64 " %" PRId64 " %" PRId64
> " %" PRId64 " %" PRId64 " %" PRId64,
>
> This log message is already quite hard to read, and I'm concerned that
> simply dumping ten unlabeled numbers is no longer helpful for
> debugging. If all of these metrics are indeed still useful, I would
> propose adding explicit labels for each one. Otherwise, we might as
> well remove this elog() entirely.
>
I have added labels for each field emitted by this debug message, as I
am not quite sure that removing it completely would be a good idea.
> ---
> @@ -1979,6 +1980,7 @@ UpdateDecodingStats(LogicalDecodingContext *ctx)
> repSlotStat.mem_exceeded_count = rb->memExceededCount;
> repSlotStat.total_txns = rb->totalTxns;
> repSlotStat.total_bytes = rb->totalBytes;
> + repSlotStat.output_bytes = rb->outputBytes;
>
> Since output_bytes is now the tenth field we accumulate for
> replication slot statistics, it seems like a good opportunity to
> improve its maintainability. Instead of adding fields individually, we
> could define a common structure for these shared metrics so that both
> PgStat_StatReplSlotEntry and ReorderBuffer can embed it. This way,
> UpdateDecodingStats() could simply assign the collected struct and
> clear it using MemSet().
>
I agree that this would improve maintainability.
PgStat_StatReplSlotEntry now has 13 fields, and 10 of them are the
decoding counters that are also tracked in ReorderBuffer, so a common
counter-only structure sounds like the right direction.
That said, this is not specific to the output_bytes field being added
here. It is more of a refactoring of the boundary between logical
decoding and pgstat statistics. To keep this patch focused, I think it
would be better to handle that as a separate follow-up patch, possibly
with a new thread once this work is done. Please let me know your
thoughts on this?
--
With Regards,
Ashutosh Sharma.
| Attachment | Content-Type | Size |
|---|---|---|
| v20260630-0001-Report-output-bytes-in-pg_stat_replication_slots.patch | application/octet-stream | 25.4 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Ewan Young | 2026-06-30 07:01:31 | JSON_VALUE/JSON_TABLE DEFAULT expression ignores RETURNING typmod |
| Previous Message | vignesh C | 2026-06-30 06:47:26 | Re: Include sequences in publications created by pg_createsubscriber |