Re: logical replication empty transactions

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Ajin Cherian <itsajin(at)gmail(dot)com>
Cc: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Rahila Syed <rahila(dot)syed(at)2ndquadrant(dot)com>, Euler Taveira <euler(at)timbira(dot)com(dot)br>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>
Subject: Re: logical replication empty transactions
Date: 2021-07-30 05:40:52
Message-ID: CAHut+PuyqcDJO0X2BxY+9ycF+ew3x77FiCbTJQGnLDbNmMASZQ@mail.gmail.com
Lists: pgsql-hackers

Hi Ajin.

I have spent some time studying how your "empty transaction" (v11)
patch will affect network traffic and transaction throughput.

BLUF
====

For my test environment the general observations with the patch applied are:
- There is a potentially large reduction of network traffic (depends
on the number of empty transactions sent)
- Transaction throughput improved up to 7% (average ~2% across
mixtures) for Synchronous mode
- Transaction throughput improved up to 7% (average ~3% across
mixtures) for NOT Synchronous mode

So this patch LGTM.

TEST INFORMATION
================

Overview
-------------

1. There are 2 similar tables. One table is published; the other is not.

2. Equivalent simple SQL operations are performed on these tables. E.g.
- INSERT/UPDATE/DELETE using normal COMMIT
- INSERT/UPDATE/DELETE using 2PC COMMIT PREPARED

3. pgbench is used to measure the throughput for different mixes of
empty and not-empty transactions sent. E.g.
- 0% are empty
- 25% are empty
- 50% are empty
- 75% are empty
- 100% are empty

4. The apply_dispatch code has been temporarily modified to log the
number of protocol messages/bytes being processed.
- At the conclusion of the test run the logs are processed to extract
the numbers.

5. Each test run is 15 minutes elapsed time.

6. The tests are repeated without/with your patch applied
- So, there are 2 (without/with patch) x 5 (different mixes) = 10 test results
- Transaction throughput results are from pgbench
- Protocol message bytes are extracted from the logs (from modified
apply_dispatch)

7. Also, the entire set of 10 test cases was repeated with the
synchronous_standby_names setting enabled/disabled.
- Enabled, so the results are for total round-trip processing of the pub/sub.
- Disabled, so there is no waiting at the publisher side.

Configuration
-------------------

My environment is a single test machine with 2 PG instances (for pub and sub).

Using default configs except:

PUB-node
- wal_level = logical
- max_wal_senders = 10
- logical_decoding_work_mem = 64kB
- checkpoint_timeout = 30min
- min_wal_size = 10GB
- max_wal_size = 20GB
- shared_buffers = 2GB
- synchronous_standby_names = 'sync_sub' (for synchronous testing only)

SUB-node
- max_worker_processes = 11
- max_logical_replication_workers = 10
- checkpoint_timeout = 30min
- min_wal_size = 10GB
- max_wal_size = 20GB
- shared_buffers = 2GB

SQL files
-------------

Contents of test_empty_not_published.sql:

-- Operations for table not published
BEGIN;
INSERT INTO test_tab_nopub VALUES(1, 'foo');
UPDATE test_tab_nopub SET b = 'bar' WHERE a = 1;
DELETE FROM test_tab_nopub WHERE a = 1;
COMMIT;

-- 2PC operations for table not published
BEGIN;
INSERT INTO test_tab_nopub VALUES(2, 'fizz');
UPDATE test_tab_nopub SET b = 'bang' WHERE a = 2;
DELETE FROM test_tab_nopub WHERE a = 2;
PREPARE TRANSACTION 'gid_nopub';
COMMIT PREPARED 'gid_nopub';

~~

Contents of test_empty_published.sql:

(same as above but the table is called test_tab)

SQL Tables
----------------

(tables are the same apart from the name)

CREATE TABLE test_tab (a int primary key, b text, c timestamptz
DEFAULT now(), d bigint DEFAULT 999);

CREATE TABLE test_tab_nopub (a int primary key, b text, c timestamptz
DEFAULT now(), d bigint DEFAULT 999);

Example pgbench command
-----------------------

(this example is showing a test for a 25% mix of empty transactions)

pgbench -s 100 -T 900 -c 1 -f test_empty_not_published.sql@5 -f
test_empty_published.sql@15 test_pub
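
The other mixes can be produced the same way by changing the two
script weights. The following is only a sketch, not the script used for
these tests: the function names are hypothetical, and it assumes the
weights always sum to 20 (as 5 + 15 does in the 25% example) and that a
script whose weight would be zero is simply left off the command line.

```shell
#!/bin/sh
# Print the pgbench command for a given percentage of empty transactions.
# Assumption: the two script weights always sum to 20, as in the 25%
# example above (5 + 15). Function names are hypothetical.
pgbench_mix_cmd() {
    empty_pct=$1
    total_weight=20
    empty_weight=$(( empty_pct * total_weight / 100 ))
    pub_weight=$(( total_weight - empty_weight ))
    scripts=""
    # Leave a script out entirely when its weight would be zero.
    if [ "$empty_weight" -gt 0 ]; then
        scripts="$scripts -f test_empty_not_published.sql@${empty_weight}"
    fi
    if [ "$pub_weight" -gt 0 ]; then
        scripts="$scripts -f test_empty_published.sql@${pub_weight}"
    fi
    echo "pgbench -s 100 -T 900 -c 1${scripts} test_pub"
}

# Print the commands for the five mixes used in the tests.
print_all_mixes() {
    for pct in 0 25 50 75 100; do
        pgbench_mix_cmd "$pct"
    done
}
```

Calling "pgbench_mix_cmd 25" prints the example command above on a
single line; "print_all_mixes" prints one command per mix, which can be
reviewed before running.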

RESULTS / OBSERVATIONS
======================

Synchronous Mode
----------------

- As the percentage mix of empty transactions increases, so does the
transaction throughput. I assume this is because we are using
synchronous mode; so when there is less waiting time, then there is
more time available for transaction processing

- The performance was generally similar before/after the patch, but
there was an observed throughput improvement of ~2% (averaged across
all mixes)

- The number of protocol bytes is associated with the number of
transactions that are processed during the test time of 15 minutes.
This adds up to a significant number of bytes even when the
transactions are empty.

- For the unpatched code, the number of traffic bytes increases as the
transaction rate increases.

- The patch improves this significantly by eliminating all the empty
transaction traffic.

- Before the patch, even "empty transactions" generate some protocol
bytes, so the traffic can never reach zero. After the patch, empty
transaction traffic is eliminated entirely.

NOT Synchronous Mode
--------------------

- Since there is no synchronous waiting for round trips, the
transaction throughput is generally consistent regardless of the empty
transaction mix.

- There is a hint of a small overall improvement in throughput as the
empty transaction mix approaches 100%. In my test environment both
the pub/sub nodes are using the same machine/CPU, so my guess is that
when there is less CPU spent processing messages in the Apply Worker
then there is more CPU available to pump transactions at the
publisher side.

- With the patch, transaction throughput seems ~3% better than
without it. This might also be attributable to the same reason
mentioned above - less CPU spent processing empty messages at the
subscriber side leaves more CPU available to pump transactions from
the publisher side.

- The number of protocol bytes is associated with the number of
transactions that are processed during the test time of 15 minutes.

- Because the transaction throughput is consistent, the traffic of
protocol bytes here is determined mainly by the proportion of "empty
transactions" in the mixture.

- Before the patch, even "empty transactions" generate some protocol
bytes, so the traffic can never reach zero. After the patch, the empty
transaction traffic is eliminated entirely.
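
For anyone repeating these measurements, the two derived figures used
in the observations (percentage throughput improvement, and protocol
bytes per transaction) are simple ratios. A minimal sketch follows; the
function names are hypothetical, and the inputs are whatever tps,
byte, and transaction counts your own runs produce.

```shell
#!/bin/sh
# Percentage throughput improvement of the patched run over the
# unpatched run. Arguments: unpatched tps, patched tps.
tps_improvement_pct() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (after - before) / before * 100 }'
}

# Average protocol bytes per transaction.
# Arguments: total protocol bytes, number of transactions processed.
bytes_per_tx() {
    awk -v bytes="$1" -v txns="$2" \
        'BEGIN { printf "%.1f\n", bytes / txns }'
}
```

For example, with placeholder inputs (not values from the report),
"tps_improvement_pct 100 107" prints 7.0 and "bytes_per_tx 1000 8"
prints 125.0.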

ATTACHMENTS
===========

PSA

A1. A PDF version of my test report (also includes raw result data)
A2. Sync: Graph of Transaction throughput
A3. Sync: Graph of Protocol bytes (total)
A4. Sync: Graph of Protocol bytes (per transaction)
A5. Not-Sync: Graph of Transaction throughput
A6. Not-Sync: Graph of Protocol bytes (total)
A7. Not-Sync: Graph of Protocol bytes (per transaction)

------
Kind Regards,
Peter Smith.
Fujitsu Australia.

Attachment Content-Type Size
PS-empty-tx-testing-15min.pdf application/pdf 316.5 KB
image/png 19.8 KB
image/png 17.8 KB
image/png 14.9 KB
image/png 18.6 KB
image/png 22.0 KB
image/png 21.4 KB
