Re: Report bytes and transactions actually sent downtream

From: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, michael(at)paquier(dot)xyz, bertranddrouvot(dot)pg(at)gmail(dot)com, andres(at)anarazel(dot)de, shveta(dot)malik(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Report bytes and transactions actually sent downtream
Date: 2026-06-30 07:01:06
Message-ID: CAE9k0PnBLCA0ZrzanQVyQCRnfk3rRZqctejCC3Cp9ZnhjMiyvQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you, Masahiko-san, for your review comments. Please find my
responses inline below:

On Tue, Jun 30, 2026 at 4:32 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Mon, Jun 29, 2026 at 2:08 AM Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
> >
> Thank you for updating the patch! The new column name output_bytes
> looks good to me.
>
> Here are some comments:
>
> memset(nulls, 0, sizeof(nulls));
> values[0] = LSNGetDatum(lsn);
> + outputBytes += sizeof(XLogRecPtr);
> values[1] = TransactionIdGetDatum(xid);
> + outputBytes += sizeof(TransactionId);
> :
> :
> /* ick, but cstring_to_text_with_len works for bytea perfectly
> fine */ values[2] =
> PointerGetDatum(cstring_to_text_with_len(ctx->out->data,
> ctx->out->len));
> + outputBytes += ctx->out->len;
>
> I'm not sure that we should include these values to the new statistics
> since they are not the data produced by output plugins. I think it's
> better to collect the amount of data output plugins produced and get
> it in OutputPluginWrite(). What do you think?
>

Agreed. The values added by the SQL interface, such as the LSN and XID
columns, are not produced by the output plugin, so they should not be
counted as output bytes.

I also agree that output_bytes should be incremented in
OutputPluginWrite(), since that is the common path for both the SQL
and walsender consumers and ctx->out contains the data produced by the
output plugin there.

I have implemented these changes in the attached patch, please have a
look and let me know your thoughts.

> ---
> - elog(DEBUG2, "UpdateDecodingStats: updating stats %p %" PRId64 "
> %" PRId64 " %" PRId64 " %" PRId64 " %" PRId64 " %" PRId64 " %" PRId64
> " %" PRId64 " %" PRId64,
> + elog(DEBUG2, "UpdateDecodingStats: updating stats %p %" PRId64 "
> %" PRId64 " %" PRId64 " %" PRId64 " %" PRId64 " %" PRId64 " %" PRId64
> " %" PRId64 " %" PRId64 " %" PRId64,
>
> This log message is already quite hard to read, and I'm concerned that
> simply dumping ten unlabeled numbers is no longer helpful for
> debugging. If all of these metrics are indeed still useful, I would
> propose adding explicit labels for each one. Otherwise, we might as
> well remove this elog() entirely.
>

I have added labels for each field emitted by this debug message, as I
am not quite sure that removing it completely would be a good idea.

> ---
> @@ -1979,6 +1980,7 @@ UpdateDecodingStats(LogicalDecodingContext *ctx)
> repSlotStat.mem_exceeded_count = rb->memExceededCount;
> repSlotStat.total_txns = rb->totalTxns;
> repSlotStat.total_bytes = rb->totalBytes;
> + repSlotStat.output_bytes = rb->outputBytes;
>
> Since output_bytes is now the tenth field we accumulate for
> replication slot statistics, it seems like a good opportunity to
> improve its maintainability. Instead of adding fields individually, we
> could define a common structure for these shared metrics so that both
> PgStat_StatReplSlotEntry and ReorderBuffer can embed it. This way,
> UpdateDecodingStats() could simply assign the collected struct and
> clear it using MemSet().
>

I agree that this would improve maintainability.
PgStat_StatReplSlotEntry now has 13 fields, and 10 of them are the
decoding counters that are also tracked in ReorderBuffer, so a common
counter-only structure sounds like the right direction.

That said, this is not specific to the output_bytes field being added
here. It is more of a refactoring of the boundary between logical
decoding and pgstat statistics. To keep this patch focused, I think it
would be better to handle that as a separate follow-up patch, possibly
with a new thread once this work is done. Please let me know your
thoughts on this?

--
With Regards,
Ashutosh Sharma.

Attachment Content-Type Size
v20260630-0001-Report-output-bytes-in-pg_stat_replication_slots.patch application/octet-stream 25.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ewan Young 2026-06-30 07:01:31 JSON_VALUE/JSON_TABLE DEFAULT expression ignores RETURNING typmod
Previous Message vignesh C 2026-06-30 06:47:26 Re: Include sequences in publications created by pg_createsubscriber