Re: Replication slot stats misgivings

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: Replication slot stats misgivings
Date: 2021-04-14 02:34:06
Message-ID: CALDaNm3=pUenmMxDUW5M6MD8P05nBq0H4+oCFjCjes+GGymGEQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 14, 2021 at 7:52 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Tue, Apr 13, 2021 at 5:07 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> >
> > On Mon, Apr 12, 2021 at 7:03 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Mon, Apr 12, 2021 at 9:36 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > > On Mon, Apr 12, 2021 at 5:29 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > >
> > > > > On Mon, Apr 12, 2021 at 8:08 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > > >
> > > > > > On Mon, Apr 12, 2021 at 4:34 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > > > >
> > > > > > > On Mon, Apr 12, 2021 at 6:19 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > > > > >
> > > > > > > > On Mon, Apr 12, 2021 at 10:27 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > > > > > >
> > > > > > > > > On Sat, Apr 10, 2021 at 9:53 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > It seems Vignesh has changed patches based on the latest set of
> > > > > > > > > > comments so you might want to rebase.
> > > > > > > > >
> > > > > > > > > I've merged my patch into the v6 patch set Vignesh submitted.
> > > > > > > > >
> > > > > > > > > I've attached the updated version of the patches. I didn't change
> > > > > > > > > anything in the patch that changes char[NAMEDATALEN] to NameData (0001
> > > > > > > > > patch) and patches that add tests.
> > > > > > > > >
> > > > > > > >
> > > > > > > > I think we can push 0001. What do you think?
> > > > > > >
> > > > > > > +1
> > > > > > >
> > > > > > > >
> > > > > > > > > In 0003 patch I reordered the
> > > > > > > > > output parameters of pg_stat_replication_slots; showing total number
> > > > > > > > > of transactions and total bytes followed by statistics for spilled and
> > > > > > > > > streamed transactions seems appropriate to me.
> > > > > > > > >
> > > > > > > >
> > > > > > > > I am not sure about this because I think we might want to add some
> > > > > > > > info of stream/spill bytes in total_bytes description (something like
> > > > > > > > stream/spill bytes are not in addition to total_bytes).
> > > > >
> > > > > BTW doesn't it confuse users that stream/spill bytes are not in
> > > > > addition to total_bytes? User will need to do "total_bytes +
> > > > > spill/stream_bytes" to know the actual total amount of data sent to
> > > > > the decoding output plugin, is that right?
> > > > >
> > > >
> > > > No, total_bytes includes the spill/stream bytes. So, the user doesn't
> > > > need to do any calculation to compute totel_bytes sent to output
> > > > plugin.
> > >
> > > The following test for the latest v8 patch seems to show different.
> > > total_bytes is 1808 whereas spill_bytes is 13200000. Am I missing
> > > something?
> > >
> > > postgres(1:85969)=# select pg_create_logical_replication_slot('s',
> > > 'test_decoding');
> > > pg_create_logical_replication_slot
> > > ------------------------------------
> > > (s,0/1884468)
> > > (1 row)
> > >
> > > postgres(1:85969)=# create table a (i int);
> > > CREATE TABLE
> > > postgres(1:85969)=# insert into a select generate_series(1, 100000);
> > > INSERT 0 100000
> > > postgres(1:85969)=# set logical_decoding_work_mem to 64;
> > > SET
> > > postgres(1:85969)=# select * from pg_stat_replication_slots ;
> > > slot_name | total_txns | total_bytes | spill_txns | spill_count |
> > > spill_bytes | stream_txns | stream_count | stream_bytes | stats_reset
> > > -----------+------------+-------------+------------+-------------+-------------+-------------+--------------+--------------+-------------
> > > s | 0 | 0 | 0 | 0 |
> > > 0 | 0 | 0 | 0 |
> > > (1 row)
> > >
> > > postgres(1:85969)=# select count(*) from
> > > pg_logical_slot_peek_changes('s', NULL, NULL);
> > > count
> > > --------
> > > 100004
> > > (1 row)
> > >
> > > postgres(1:85969)=# select * from pg_stat_replication_slots ;
> > > slot_name | total_txns | total_bytes | spill_txns | spill_count |
> > > spill_bytes | stream_txns | stream_count | stream_bytes | stats_reset
> > > -----------+------------+-------------+------------+-------------+-------------+-------------+--------------+--------------+-------------
> > > s | 2 | 1808 | 1 | 202 |
> > > 13200000 | 0 | 0 | 0 |
> > > (1 row)
> > >
> >
> > Thanks for identifying this issue, while spilling the transactions
> > reorder buffer changes gets released, we will not be able to get the
> > total size for spilled transactions from reorderbuffer size. I have
> > fixed it by including spilledbytes to totalbytes in case of spilled
> > transactions. Attached patch has the fix for this.
> > Thoughts?
>
> I've not looked at the patches yet but as Amit mentioned before[1],
> it's better to move 0002 patch to after 0004. That is, 0001 patch
> changes data type to NameData, 0002 patch adds total_txn and
> total_bytes, and 0003 patch adds regression tests. 0004 patch will be
> the patch using HTAB (was 0002 patch) and get reviewed after pushing
> 0001, 0002, and 0003 patches. 0005 patch adds more regression tests
> for the problem 0004 patch addresses.

I will make the change for this and post a patch for this.
Currently we have kept total_txns and total_bytes at the beginning of
pg_stat_replication_slots, I did not see any conclusion on this. I
preferred it to be at the beginning.
Thoughts?

Regards,
Vignesh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Japin Li 2021-04-14 02:38:13 Re: Truncate in synchronous logical replication failed
Previous Message Craig Ringer 2021-04-14 02:23:51 Re: [PATCH] Identify LWLocks in tracepoints