Re: Initial COPY of Logical Replication is too slow

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Initial COPY of Logical Replication is too slow
Date: 2026-01-26 20:30:03
Message-ID: CAD21AoDT3sL2COprsRumM9zEpL1Bk5VWboK4V2mRnjGua8xfeA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jan 19, 2026 at 9:44 AM Marcos Pegoraro <marcos(at)f10(dot)com(dot)br> wrote:
>
> On Fri, Dec 19, 2025 at 22:59, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>
>> Yeah, if we pass pg_get_publication_tables() a publication that a
>> lot of tables belong to, it could take a long time to return, as it
>> needs to construct many entries.
>
>
> Well, I don't know how to help, but I'm sure it's performing badly.
> Today I added some columns on my server, and looking at the logs I could see how slow this process is.
>
> duration: 2213.872 ms statement: SELECT DISTINCT (CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts) THEN NULL ELSE gpt.attrs END) FROM pg_publication p, LATERAL pg_get_publication_tables(p.pubname) gpt, pg_class c WHERE gpt.relid = 274376788 AND c.oid = gpt.relid AND p.pubname IN ( 'mypub' )
>
> Two seconds to get the column list of a table is really too slow.
> How can we solve this?

After investigating the slowness further, it seems that the
list_concat_unique_oid() call below is quite slow when the database
has a lot of tables to publish:

relids = GetPublicationRelations(pub_elem->oid,
                                 pub_elem->pubviaroot ?
                                 PUBLICATION_PART_ROOT :
                                 PUBLICATION_PART_LEAF);
schemarelids = GetAllSchemaPublicationRelations(pub_elem->oid,
                                                pub_elem->pubviaroot ?
                                                PUBLICATION_PART_ROOT :
                                                PUBLICATION_PART_LEAF);
pub_elem_tables = list_concat_unique_oid(relids, schemarelids);

This is simply because list_concat_unique_oid() is O(n^2), where n is
the number of OIDs in schemarelids in the test case. A simple change
would be to do a sort & dedup instead. With the attached experimental
patch, the pg_get_publication_tables() execution time is halved in my
environment (796 ms -> 430 ms with 50k tables). If the number of
tables is not large, this method might be slower than today, but it's
not a huge regression.
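
For illustration, here is a minimal sketch of the sort & dedup idea
(the attached patch may differ in detail); it uses list_sort(),
list_oid_cmp() and list_deduplicate_oid() from pg_list.h:

/*
 * Concatenate both lists, sort them by OID, then drop adjacent
 * duplicates -- O(n log n) overall, instead of the O(n^2)
 * list_concat_unique_oid().
 */
pub_elem_tables = list_concat(relids, schemarelids);
list_sort(pub_elem_tables, list_oid_cmp);
list_deduplicate_oid(pub_elem_tables);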

In the initial tablesync case, this could be optimized further by
introducing a new SQL function that returns the column list and row
filter expression of a specific table. That way, we can filter the
result by relid at an early stage, instead of gathering all the
information and then filtering by relid as the tablesync worker does
today, avoiding the overhead of collecting system catalog scan
results.
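
To sketch what I mean, the tablesync worker could then run something
like the following (the function name, signature, and output columns
are just a strawman, not an existing API):

-- Strawman: return only the column list and row filter of the given
-- relation for the given publication, rather than expanding every
-- table the publication contains and filtering afterwards.
SELECT attrs, qual
FROM pg_get_publication_table_info('mypub', 274376788::oid);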

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment: sort_and_dedup.patch (application/octet-stream, 637 bytes)
