Re: Replication slot stats misgivings

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: Replication slot stats misgivings
Date: 2021-04-13 05:15:49
Message-ID: CAD21AoAarZeXkzGtHpssO-G4=+5z7zTE7TjidwaN5u+XT+Ktvw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 12, 2021 at 9:16 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> On Mon, Apr 12, 2021 at 4:46 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Sat, Apr 10, 2021 at 6:51 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > >
> >
> > Thanks, 0001 and 0002 look good to me. I have a minor comment for 0002.
> >
> > <entry role="catalog_table_entry"><para role="column_definition">
> > + <structfield>total_bytes</structfield><type>bigint</type>
> > + </para>
> > + <para>
> > + Amount of decoded transactions data sent to the decoding output plugin
> > + while decoding the changes from WAL for this slot. This can be used to
> > + gauge the total amount of data sent during logical decoding.
> >
> > Can we slightly extend it to say something like: Note that this
> > includes the bytes streamed and or spilled. Similarly, we can extend
> > it for total_txns.
> >
>
> Thanks for the comments, the comments are fixed in the v8 patch attached.
> Thoughts?

Here are review comments on new TAP tests:

+# Create replication slots.
+$node->safe_psql('postgres',
+ "SELECT 'init' FROM
pg_create_logical_replication_slot('regression_slot1',
'test_decoding')");
+$node->safe_psql('postgres',
+ "SELECT 'init' FROM
pg_create_logical_replication_slot('regression_slot2',
'test_decoding')");
+$node->safe_psql('postgres',
+ "SELECT 'init' FROM
pg_create_logical_replication_slot('regression_slot3',
'test_decoding')");
+$node->safe_psql('postgres',
+ "SELECT 'init' FROM
pg_create_logical_replication_slot('regression_slot4',
'test_decoding')");

and

+
+$node->safe_psql('postgres',
+ "SELECT data FROM
pg_logical_slot_get_changes('regression_slot1', NULL, NULL,
'include-xids', '0', 'skip-empty-xacts', '1')");
+$node->safe_psql('postgres',
+ "SELECT data FROM
pg_logical_slot_get_changes('regression_slot2', NULL, NULL,
'include-xids', '0', 'skip-empty-xacts', '1')");
+$node->safe_psql('postgres',
+ "SELECT data FROM
pg_logical_slot_get_changes('regression_slot3', NULL, NULL,
'include-xids', '0', 'skip-empty-xacts', '1')");
+$node->safe_psql('postgres',
+ "SELECT data FROM
pg_logical_slot_get_changes('regression_slot4', NULL, NULL,
'include-xids', '0', 'skip-empty-xacts', '1')");

I think we can do those similar queries in a single psql connection
like follows:

# Create replication slots.
$node->safe_psql('postgres',
qq[
SELECT pg_create_logical_replication_slot('regression_slot1', 'test_decoding');
SELECT pg_create_logical_replication_slot('regression_slot2', 'test_decoding');
SELECT pg_create_logical_replication_slot('regression_slot3', 'test_decoding');
SELECT pg_create_logical_replication_slot('regression_slot4', 'test_decoding');
]);

and

$node->safe_psql('postgres',
qq[
SELECT data FROM pg_logical_slot_get_changes('regression_slot1', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
SELECT data FROM pg_logical_slot_get_changes('regression_slot2', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
SELECT data FROM pg_logical_slot_get_changes('regression_slot3', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
SELECT data FROM pg_logical_slot_get_changes('regression_slot4', NULL,
NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
]);

---
+# Wait for the statistics to be updated.
+my $slot1_stat_check_query =
+ "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name
= 'regression_slot1' AND total_txns > 0 AND total_bytes > 0;";
+my $slot2_stat_check_query =
+ "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name
= 'regression_slot2' AND total_txns > 0 AND total_bytes > 0;";
+my $slot3_stat_check_query =
+ "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name
= 'regression_slot3' AND total_txns > 0 AND total_bytes > 0;";
+my $slot4_stat_check_query =
+ "SELECT count(1) = 1 FROM pg_stat_replication_slots WHERE slot_name
= 'regression_slot4' AND total_txns > 0 AND total_bytes > 0;";
+
+# Verify that the statistics have been updated.
+$node->poll_query_until('postgres', $slot1_stat_check_query)
+ or die "Timed out while waiting for statistics to be updated";
+$node->poll_query_until('postgres', $slot2_stat_check_query)
+ or die "Timed out while waiting for statistics to be updated";
+$node->poll_query_until('postgres', $slot3_stat_check_query)
+ or die "Timed out while waiting for statistics to be updated";
+$node->poll_query_until('postgres', $slot4_stat_check_query)
+ or die "Timed out while waiting for statistics to be updated";

We can simplify the above code to something like:

$node->poll_query_until(
'postgres', qq[
SELECT count(slot_name) >= 4
FROM pg_stat_replication_slots
WHERE slot_name ~ 'regression_slot'
AND total_txns > 0
AND total_bytes > 0;
]) or die "Timed out while waiting for statistics to be updated";

---
+# Test to remove one of the replication slots and adjust max_replication_slots
+# accordingly to the number of slots and verify replication statistics data is
+# fine after restart.

I think it's better if we explain in detail what cases we're trying to
test. How about the following description?

Test to remove one of the replication slots and adjust
max_replication_slots accordingly to the number of slots. This leads
to a mismatch of the number of slots between in the stats file and on
shared memory, simulating the message for dropping a slot got lost. We
verify replication statistics data is fine after restart.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kohei KaiGai 2021-04-13 05:22:35 Re: TRUNCATE on foreign table
Previous Message Virender Singla 2021-04-13 04:51:03 Re: vacuum freeze - possible improvements