Add support for specifying tables in pg_createsubscriber.

From: Shubham Khanna <khannashubham1197(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Add support for specifying tables in pg_createsubscriber.
Date: 2025-07-21 05:30:30
Message-ID: CAHv8Rj+sxWutv10WiDEAPZnygaCbuY2RqiLMj2aRMH-H3iZwyA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi hackers,

Currently, pg_createsubscriber supports converting streaming
replication to logical replication for selected databases or all
databases. However, there is no provision to replicate only a few
selected tables. For such cases, users are forced to manually set up
logical replication using individual SQL commands (CREATE PUBLICATION,
CREATE SUBSCRIPTION, etc.), which can be time-consuming and
error-prone. Extending pg_createsubscriber to support table-level
replication would significantly improve the time taken to perform the
setup.
The attached patch introduces a new '--table' option that can be
specified after each '--database' argument. It allows users to
selectively replicate specific tables within a database instead of
defaulting to all tables. The syntax is like that used in 'vacuumdb'
and supports multiple '--table' arguments per database, including
optional column lists and row filters.
Example usage:
./pg_createsubscriber \ --database db1 \ --table 'public.t1' \ --table
'public.t2(a,b) WHERE a > 100' \ --database db2 \ --table 'public.t3'

I conducted tests comparing the patched pg_createsubscriber with
standard logical replication under various scenarios to assess
performance and flexibility. All test results represent the average of
five runs.

Scenario pg_createsubscriber Logical Replication Improvement

Two databases
(postgres and
db1 each
having 100
tables), replicate
all 100 in
postgres, 50
tables in db1
(100MB/table)
total 15GB data 2m4.823s 7m23.294s 71.85%

One DB, 100
tables, replicate
50 tables
(200 MB/table)
total 10GB data 2m47.703s 4m58.003s 43.73%

One DB, 200
tables, replicate
100 tables
(100 MB/table)
total 10GB data 3m6.476s 4m35.130s 32.22%

One DB, 100
tables, replicate
50 tables
(100MB/table)
total 5GB data 1m54.384s 2m23.719s 20.42%

These results demonstrate that pg_createsubscriber consistently
outperforms standard logical replication by 20.42% for 5GB data to
71.85% for 15GB data, the time taken reduces as the data increases.

The attached test scripts were used for all experiments.
Scenario 1 (Logical replication setup involving 50 tables across 2
databases, each containing 100 tables with 100 MB of data per table):
pg_createsubscriber_setup_multi_db.sh was used for setup, followed by
pg_createsubscriber_test_multi_db.sh to measure performance. For
logical replication, the setup was done using
logical_replication_setup_multi_db.sh, with performance measured via
logical_replication_test_multi_db.sh.

Scenario 2 and 3:
The pg_createsubscriber_setup_single_db.sh (uncomment appropriate
scenario mentioned in comments) script was used, with configuration
changes specific to Scenario 2 and Scenario 3. In both cases,
pg_createsubscriber_test_single_db.sh (uncomment appropriate scenario
mentioned in comments) was used for measuring performance. Logical
replication followed the same pattern, using
logical_replication_setup_single_db.sh (uncomment appropriate scenario
mentioned in comments) and logical_replication_test_single_db.sh
(uncomment appropriate scenario mentioned in comments) for
measurement.

Scenario 4 (Logical replication setup on 50 tables from a database
containing 100 tables, each with 100 MB of data):
pg_createsubscriber_setup_single_db.sh (without modifications) was
used for setup, and pg_createsubscriber_test_single_db.sh (without
modifications) was used for performance measurement. Logical
replication used logical_replication_setup_single_db.sh (without
modifications) for setup and logical_replication_test_single_db.sh
(without modifications) for measurement.

Thoughts?

Thanks and regards,
Shubham Khanna.

Attachment Content-Type Size
v1-0001-Support-tables-via-pg_createsubscriber.patch application/octet-stream 14.1 KB
Scripts.zip application/x-zip-compressed 4.7 KB

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2025-07-21 05:34:15 Re: Document slot's restart_lsn can go backward
Previous Message shveta malik 2025-07-21 05:18:16 Re: POC: enable logical decoding when wal_level = 'replica' without a server restart