Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, "Wei Wang (Fujitsu)" <wangw(dot)fnst(at)fujitsu(dot)com>, "Yu Shi (Fujitsu)" <shiy(dot)fnst(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Date: 2023-08-11 10:56:26
Message-ID: CALDaNm2k2z3Hpa3Omb_tpxWkyHnUvsWjJMbqDs-2uD2eLzemJQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, 9 Aug 2023 at 09:51, vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> Hi Melih,
>
> Here is a patch to help in measuring the execution time at various
> phases, like: a) replication slot creation time, b) WAL reading,
> c) number of WAL records read, d) subscription relation state change, etc.
> A couple of observations while we tested with this patch:
> 1) We noticed that the v26-0001 patch takes more time for finding the
> decoding start point.
> 2) Another observation was that the number of XLOG records read to
> identify the consistent point was significantly higher with the
> v26-0001 patch.
>
> HEAD
> postgres=# select avg(counttime)/1000 "avgtime(ms)",
> median(counttime)/1000 "median(ms)", min(counttime)/1000
> "mintime(ms)", max(counttime)/1000 "maxtime(ms)", logtype from test
> group by logtype;
>        avgtime(ms)       |       median(ms)       | mintime(ms) | maxtime(ms) |         logtype
> -------------------------+------------------------+-------------+-------------+--------------------------
>  0.00579245283018867920  | 0.00200000000000000000 |           0 |           1 | SNAPSHOT_BUILD
>      1.2246811320754717  | 0.98550000000000000000 |           0 |          37 | LOGICAL_SLOT_CREATION
>    171.0863283018867920  |   183.9120000000000000 |           0 |         408 | FIND_DECODING_STARTPOINT
>      2.0699433962264151  |     1.4380000000000000 |           1 |          49 | INIT_DECODING_CONTEXT
> (4 rows)
>
> HEAD + v26-0001 patch
> postgres=# select avg(counttime)/1000 "avgtime(ms)",
> median(counttime)/1000 "median(ms)", min(counttime)/1000
> "mintime(ms)", max(counttime)/1000 "maxtime(ms)", logtype from test
> group by logtype;
>        avgtime(ms)       |       median(ms)       | mintime(ms) | maxtime(ms) |         logtype
> -------------------------+------------------------+-------------+-------------+--------------------------
>  0.00588113207547169810  | 0.00500000000000000000 |           0 |           0 | SNAPSHOT_BUILD
>      1.1270962264150943  |     1.1000000000000000 |           0 |           2 | LOGICAL_SLOT_CREATION
>    301.1745528301886790  |   410.4870000000000000 |           0 |         427 | FIND_DECODING_STARTPOINT
>      1.4814660377358491  |     1.4530000000000000 |           1 |           9 | INIT_DECODING_CONTEXT
> (4 rows)
>
> In the above, FIND_DECODING_STARTPOINT is much higher with the v26-0001 patch.
>
> HEAD
> FIND_DECODING_XLOG_RECORD_COUNT
> - average = 2762
> - median = 3362
>
> HEAD + reuse worker patch (v26-0001)
> FIND_DECODING_XLOG_RECORD_COUNT
> - average = 4105
> - median = 5345
>
> Similarly, the number of XLOG records read is higher with the v26-0001 patch.
>
> Steps to calculate the timing:
> -- first collect the necessary LOG lines from the subscriber's log.
> cat *.log | grep -E \
> '(LOGICAL_SLOT_CREATION|INIT_DECODING_CONTEXT|FIND_DECODING_STARTPOINT|SNAPSHOT_BUILD|FIND_DECODING_XLOG_RECORD_COUNT|LOGICAL_XLOG_READ|LOGICAL_DECODE_PROCESS_RECORD|LOGICAL_WAIT_TRANSACTION)' \
> > grep.dat
>
> create table testv26(logtime varchar, pid varchar, level varchar,
> space varchar, logtype varchar, counttime int);
> -- then copy this data into a DB table to compute the averages.
> COPY testv26 FROM '/home/logs/grep.dat' DELIMITER ' ';
>
> -- Finally, use the SQL to analyze the data:
> select avg(counttime)/1000 "avgtime(ms)", logtype from testv26 group by logtype;
>
> --- To get the number of xlog records read:
> select avg(counttime) from testv26 where logtype
> ='FIND_DECODING_XLOG_RECORD_COUNT' and counttime != 1;
>
> Thanks to Peter and Hou-san who helped in finding these out. We are
> analysing this in parallel; @Melih Mutlu, I am posting this information
> so that it might help you too in analysing this issue.
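
For reference, the median() used in the queries above is not a built-in
PostgreSQL aggregate; the same figures can be obtained with the built-in
ordered-set aggregate percentile_cont(0.5). A minimal equivalent query
against the same table (counttime is in microseconds, as above) would be:

select avg(counttime)/1000 "avgtime(ms)",
percentile_cont(0.5) within group (order by counttime)/1000 "median(ms)",
min(counttime)/1000 "mintime(ms)", max(counttime)/1000 "maxtime(ms)",
logtype from testv26 group by logtype;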

I analysed further why it needs to read a larger number of XLOG
records in some cases while creating the replication slot; here are my
thoughts:
Note: The tablesync worker needs to connect to the publisher and create
a consistent point for its slot by reading XLOG records. This requires
that all the open transactions, and any transactions started while the
consistent point is being created, have committed.
I feel slot creation is faster in a few cases in HEAD because:
Publisher | Subscriber
------------------------------------------------------------
Begin txn1 transaction |
Insert 1..1000 records |
Commit |
Begin txn2 transaction |
Insert 1..1000 records | Apply worker applies transaction txn1
| Start tablesync table t2
| create consistent point in
| publisher before transaction txn3 is
| started
commit | We just need to wait till
| transaction txn2 is finished.
Begin txn3 transaction |
Insert 1..1000 records |
commit |

In V26, this is happening in some cases:
Publisher | Subscriber
------------------------------------------------------------
Begin txn1 transaction |
Insert 1..1000 records |
Commit |
Begin txn2 transaction |
Insert 1..1000 records | Apply worker applies transaction txn1
| Start tablesync table t2
commit | Create consistent point
Begin txn3 transaction | (since transaction txn2 is committed
| and txn3 is started, we will
| need to wait
| for transaction txn3 to be
| committed)
Insert 1..1000 records |
commit |
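
The publisher-side pattern in the above diagrams can be approximated
with a simple repeated transaction like the one below (the table name
t1 and the single integer column are only illustrative; any published
table works). Running it in a loop keeps a transaction open on the
publisher most of the time:

-- hypothetical workload, run repeatedly via psql on the publisher
BEGIN;
INSERT INTO t1 SELECT generate_series(1, 1000);
COMMIT;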

This is because in HEAD the tablesync worker is started right after a
commit, so in some cases we are able to create the consistent point
before a new transaction is started.
Slot creation will be fastest if the tablesync worker is able to connect
to the publisher and create a consistent point before the next
transaction is started. The probability of this is better in HEAD for
this scenario, as the new tablesync worker is started just after a
commit and therefore has a better time window (because the current
transaction has only just started) before another new transaction
begins. This probability is slightly lower with the v26 version.
I feel this is purely a timing issue in a few cases, caused by the
timing of new transactions starting on the publisher while the slot is
being created.
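
The underlying behaviour can be seen even without the patches: creation
of a logical slot has to wait until every transaction that is in
progress on the publisher has finished before a consistent point can be
reached. A minimal sketch, assuming a scratch database with wal_level =
logical, the test_decoding contrib module, and any table t1 (the slot
name demo_slot is arbitrary):

-- session 1: open a transaction with an assigned XID and keep it open
BEGIN;
INSERT INTO t1 VALUES (1);

-- session 2: slot creation blocks in DecodingContextFindStartpoint()
-- until the transaction in session 1 commits or aborts
SELECT pg_create_logical_replication_slot('demo_slot', 'test_decoding');

-- session 1: committing lets session 2 finish creating the slot
COMMIT;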

Regards,
Vignesh
