Re: Report bytes and transactions actually sent downtream

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, michael(at)paquier(dot)xyz, bertranddrouvot(dot)pg(at)gmail(dot)com, andres(at)anarazel(dot)de, shveta(dot)malik(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Report bytes and transactions actually sent downtream
Date: 2026-06-29 23:01:30
Message-ID: CAD21AoBaP60vGeGoW5Twizf6NKsDD_G4JD2gUHFgX0CywiaM3g@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jun 29, 2026 at 2:08 AM Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
>
> Hi Amit,
>
> On Sun, Jun 28, 2026 at 12:26 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Thu, Jun 25, 2026 at 4:37 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Tue, Jun 16, 2026 at 2:06 AM Ashutosh Bapat
> > > <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> > > >
> > > >
> > > > Those are logical message types which are part of the logical change
> > > > data - without those messages it's not possible to process the logical
> > > > change data. So they are included. But the keepalive messages, for
> > > > example, aren't part of the logical change data.
> > >
> > > I think logical replication messages like STREAM START/STREAM STOP and
> > > BEGIN/END, versus messages like keepalive and standby/primary status
> > > updates, operate at different layers. The former are the contents of
> > > the logical output and seem to belong naturally to the replication
> > > slot's statistics. I'm not sure the latter should be included as well
> > > -- I'm concerned that counting those bytes would become noise when
> > > analyzing the statistics over time, since they have no relation to the
> > > volume of logical changes.
> > >
> >
> > Both you and Ashutosh seem to make the similar points and sound
> > reasonable, so let's do it that way.
> >
> > > If we do want a statistic showing the literal total bytes sent
> > > downstream including protocol messages, ISTM that should be available
> > > for both logical and physical replication: physical replication also
> > > uses keepalive messages and adds a header to each message. In other
> > > words, that kind of "bytes on the wire" metric isn't really specific
> > > to a logical replication slot, so the slot's statistics don't seem
> > > like the right place for it.
> > >
> > > The proposed column name 'sent_bytes' is also confusing to me, because
> > > I don't think we can call it "total bytes actually sent" in the
> > > logical decoding SQL API case. A name like 'plugin_total_bytes' seems
> > > more straightforward and conveys the intent that protocol messages are
> > > not included.
> > >
> >
> > The SQL API point is genuine and if we display sent_bytes via SQL API
> > then pg_logical_slot_get_changes() will show nonzero sent_bytes even
> > though nothing was ever sent anywhere. OTOH, adding plugin_* prefix
> > also starts to make it sound like stats are plugin specific, how about
> > calling it as 'output_bytes'? It pairs cleanly with the existing
> > column: total_bytes = decoded into the reorder buffer, output_bytes =
> > decoded and handed to the consumer.
> >
> > If we use output_bytes, then we can describe the new stats on the
> > lines of following text in the docs:
> > <para>
> > Amount of decoded data produced for this slot's consumer by the output
> > plugin, after applying any output plugin filters and converting the
> > changes into the output plugin's format. This counts the transaction
> > changes together with the messages that delimit them (such as the
> > begin and commit messages), but not connection-management messages
> > such as keepalives, which are generated by the server rather than the
> > output plugin and are therefore not included.
> > </para>
> > <para>
> > This value can differ from <structfield>total_bytes</structfield>: it
> > may be smaller because filtered changes are not output, or larger
> > because the output plugin's format can be more verbose than the
> > decoded changes. For these reasons
> > <structfield>output_bytes</structfield> is not directly comparable to
> > <structfield>total_bytes</structfield>.
> > </para>
> >
>
> Please find the attached patch that renames sent_bytes to
> output_bytes. Kindly have a look and feel free to share any comments
> or suggestions.

Thank you for updating the patch! The new column name output_bytes
looks good to me.

Here are some comments:

memset(nulls, 0, sizeof(nulls));
values[0] = LSNGetDatum(lsn);
+ outputBytes += sizeof(XLogRecPtr);
values[1] = TransactionIdGetDatum(xid);
+ outputBytes += sizeof(TransactionId);
:
:
/* ick, but cstring_to_text_with_len works for bytea perfectly
fine */ values[2] =
PointerGetDatum(cstring_to_text_with_len(ctx->out->data,
ctx->out->len));
+ outputBytes += ctx->out->len;

I'm not sure that we should include these values to the new statistics
since they are not the data produced by output plugins. I think it's
better to collect the amount of data output plugins produced and get
it in OutputPluginWrite(). What do you think?

---
- elog(DEBUG2, "UpdateDecodingStats: updating stats %p %" PRId64 "
%" PRId64 " %" PRId64 " %" PRId64 " %" PRId64 " %" PRId64 " %" PRId64
" %" PRId64 " %" PRId64,
+ elog(DEBUG2, "UpdateDecodingStats: updating stats %p %" PRId64 "
%" PRId64 " %" PRId64 " %" PRId64 " %" PRId64 " %" PRId64 " %" PRId64
" %" PRId64 " %" PRId64 " %" PRId64,

This log message is already quite hard to read, and I'm concerned that
simply dumping ten unlabeled numbers is no longer helpful for
debugging. If all of these metrics are indeed still useful, I would
propose adding explicit labels for each one. Otherwise, we might as
well remove this elog() entirely.

---
@@ -1979,6 +1980,7 @@ UpdateDecodingStats(LogicalDecodingContext *ctx)
repSlotStat.mem_exceeded_count = rb->memExceededCount;
repSlotStat.total_txns = rb->totalTxns;
repSlotStat.total_bytes = rb->totalBytes;
+ repSlotStat.output_bytes = rb->outputBytes;

Since output_bytes is now the tenth field we accumulate for
replication slot statistics, it seems like a good opportunity to
improve its maintainability. Instead of adding fields individually, we
could define a common structure for these shared metrics so that both
PgStat_StatReplSlotEntry and ReorderBuffer can embed it. This way,
UpdateDecodingStats() could simply assign the collected struct and
clear it using MemSet().

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2026-06-29 23:18:08 Re: uuidv7 improperly accepts dates before 1970-01-01
Previous Message Tomas Vondra 2026-06-29 22:46:13 Re: occasional ECPG failures on dikkop (FreeBSD)