Re: Initial COPY of Logical Replication is too slow

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Initial COPY of Logical Replication is too slow
Date: 2026-03-09 22:09:38
Message-ID: CAD21AoA9YgiY1rVKMPZwB00WU_G4UfzoawY=7hyd7hpvBPcK6w@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 3, 2026 at 2:22 AM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Saturday, February 28, 2026 7:48 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > Another variant of this approach is to extend
> > pg_get_publication_tables() so that it can accept a relid to get the
> > publication information of the specific table. I've attached the patch for
> > this idea. I'm going to add regression test cases.
> >
> > pg_get_publication_tables() is a VARIADIC array function, so the patch
> > changes its signature to {text[] [, oid]}, breaking tool compatibility. Given
> > that this function is mostly for internal use (we don't have documentation
> > for it), that is probably acceptable. I find it clearer than the other
> > approach of introducing pg_get_publication_table_info(). Feedback is very
> > welcome.
>
> Thanks for updating the patch.
>
> I have a few comments on the function change:
>
> 1.
>
> If we change the function signature, will it affect use cases where the
> publisher version is newer and the subscriber version is older? E.g., when a
> text-style publication name is passed to pg_get_publication_tables().

Good point.

I noticed that changing the function signature of
pg_get_publication_tables() breaks logical replication setups where
the subscriber is running PostgreSQL 18 or older. In the latest patch,
I've switched the approach back to the pg_get_publication_table_info()
idea.
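
To illustrate the compatibility concern, here is a rough sketch of the
call shapes involved. It assumes a publication named 'pub' exists on the
publisher. The first two statements use the existing VARIADIC text[]
signature. The third uses the two-argument form from the earlier patch
version, as seen in the example quoted later in this message. That form
is not part of any release and only resolves with that patch applied.

```sql
-- Existing signature (VARIADIC text[]): both call styles resolve today.
SELECT relid::regclass FROM pg_get_publication_tables('pub');
SELECT relid::regclass FROM pg_get_publication_tables(VARIADIC ARRAY['pub']);

-- Earlier patch version only (assumption: signature is text[] [, oid]).
-- A plain text[] plus a relid restricts the result to one table.
SELECT * FROM pg_get_publication_tables(ARRAY['pub'],
                                        'measurements_2023_q1'::regclass);
```

Because the first style passes a single text value rather than an
array, it is the one at risk once the first parameter becomes a plain
text[].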

>
> 2.
>
> In the following example, I expected it to output a table with a valid row
> filter, but it returns 0 rows after applying the patch.
>
> CREATE TABLE measurements (
>     city_id int not null,
>     logdate date not null,
>     peaktemp int,
>     unitsales int
> ) PARTITION BY RANGE (logdate);
>
> -- Create partitions
> CREATE TABLE measurements_2023_q1 PARTITION OF measurements
>     FOR VALUES FROM ('2023-01-01') TO ('2023-04-01');
>
> CREATE PUBLICATION pub FOR TABLE measurements_2023_q1 WHERE (city_id = 2);
>
> select pg_get_publication_tables(ARRAY['pub2'], 'measurements_2023_q1'::regclass);
> pg_get_publication_tables
> ---------------------------
> (0 rows)

Thank you for testing the patch. I've fixed it and added regression
tests in the latest patch.
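
As a cross-check that does not depend on the patch, the row filter
attached to the partition can be inspected through the
pg_publication_tables system view (PostgreSQL 15 or later). This is only
a sketch and assumes the publication from the example above is named
'pub':

```sql
-- List each table published by 'pub' together with its row filter.
SELECT pubname, schemaname, tablename, rowfilter
FROM pg_publication_tables
WHERE pubname = 'pub';
```

With the example setup, this would be expected to list
measurements_2023_q1 along with its filter on city_id.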

I've attached the updated patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v2-0001-Avoid-full-table-scans-when-getting-publication-t.patch application/x-patch 26.5 KB
