RE: [PATCH] Support automatic sequence replication

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: [PATCH] Support automatic sequence replication
Date: 2026-03-05 12:57:03
Message-ID: OS9PR01MB12149E526A7FE1202683C5529F57DA@OS9PR01MB12149.jpnprd01.prod.outlook.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Hackers,

> 05.
>
> LogicalRepSyncSequences() starts the transaction and read sequences every
> time.
> Can we cache the seqinfos to reuse in the next iteration? My idea is to introduce
> a syscache callback for the pg_subscription_relto invalidate the cached list.
>
> How about measuring performance once and considering it's a good
> improvement?

I profiled the sequencesync worker when sequences were less actively updated on
the publisher side. In the actively updated system, copying sequences used most
of the CPU time; thus, we could not observe the effect.

Abstract
--------------
Sequencesync worker spent 20-25% of the working time scanning pg_subscription_rel
in the workload. It's not so large compared with the total CPU time; the worker
can work once per 2 seconds or longer. We may able to consider the optimization
if there are easy ways.

Source
-----------
ea47447 + v8 patch set + attached fix patch.
To simplify the analysis, I extracted the scan part into the function
scan_subscription_relations. No configure options are set at build.

Workload
---------------
Two workloads were tested.

A - profile with no sequence updates

0. Defined 100 sequences on both nodes
1. Built a pub-sub replication system.
2. Attached the sequencesync worker as early as creating the subscription.
3. Waited 10 minutes.

B - profile with 10% sequences updates

0. Defined 100 sequences on both nodes
1. Built a pub-sub replication system.
2. Waited till the initial sync was done. On my env 100s was enough
3. Attached the sequencesync worker
4. Updated 10 sequences per second.
5. Repeat step 4 for 10 minutes.

Result
----------
The attached profiles show the detailed results: noupdate.out corresponds to
workload A, while 10percent_update.out is for workload B.

In both cases, scan_subscription_relations spends more than 20% of their working
time. Notable points are to open sequence relations with the AccessShareLock,
committing the transaction, starting the catalog scan, etc.

workload A:
```
| --20.83%--scan_subscription_relations
| |
| |--10.83%--try_table_open
| | try_relation_open
```

workload B:
```
| --24.01%--scan_subscription_relations
| |
| |--12.52%--try_table_open
| | try_relation_open
```

Consideration
--------------
Based on that, we may be able to cache seqinfos to avoid starting the
transaction and opening the sequence. But we need to introduce a relcache
callback to invalidate the specific entry of the list, not sure it's beneficial
more than the complexity.

Configuration
----------------------
Each node had shared_buffer=1GB, and others had the default GUC values.

Environment
--------------------
CPU: Intel(R) Xeon(R) Platinum 8358P CPU @ 2.60GHz, 4 cores, 1 thread per core
Memory: 15GiB
OS: AlmaLinux 9.7

Best regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment Content-Type Size
10percent_update.out application/octet-stream 817.2 KB
noupdate.out application/octet-stream 717.0 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ilia Evdokimov 2026-03-05 12:58:21 Re: V18 change on EXPLAIN ANALYZE
Previous Message Pavel Stehule 2026-03-05 12:54:06 Re: proposal: schema variables