| From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
|---|---|
| To: | Marcos Pegoraro <marcos(at)f10(dot)com(dot)br> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Initial COPY of Logical Replication is too slow |
| Date: | 2026-01-26 20:30:03 |
| Message-ID: | CAD21AoDT3sL2COprsRumM9zEpL1Bk5VWboK4V2mRnjGua8xfeA@mail.gmail.com |
| Lists: | pgsql-hackers |
On Mon, Jan 19, 2026 at 9:44 AM Marcos Pegoraro <marcos(at)f10(dot)com(dot)br> wrote:
>
> On Fri, Dec 19, 2025 at 10:59 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>
>> Yeah, if we pass pg_get_publication_tables() a publication that a lot
>> of tables belong to, it could take a long time to return as it needs
>> to construct many entries.
>
>
> Well, I don't know how to help, but I'm sure it's working badly.
> Today I added some fields on my server, and looking at the logs I could see how slow this process is.
>
> duration: 2213.872 ms statement: SELECT DISTINCT (CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts) THEN NULL ELSE gpt.attrs END) FROM pg_publication p, LATERAL pg_get_publication_tables(p.pubname) gpt, pg_class c WHERE gpt.relid = 274376788 AND c.oid = gpt.relid AND p.pubname IN ( 'mypub' )
>
> 2 seconds to get the list of fields of a table is really too slow.
> How can we solve this?
After more investigation of the slowness, it seems that the
list_concat_unique_oid() call below is quite slow when the database
has a lot of tables to publish:
        relids = GetPublicationRelations(pub_elem->oid,
                                         pub_elem->pubviaroot ?
                                         PUBLICATION_PART_ROOT :
                                         PUBLICATION_PART_LEAF);
        schemarelids = GetAllSchemaPublicationRelations(pub_elem->oid,
                                                        pub_elem->pubviaroot ?
                                                        PUBLICATION_PART_ROOT :
                                                        PUBLICATION_PART_LEAF);
        pub_elem_tables = list_concat_unique_oid(relids, schemarelids);
This is simply because it's O(n^2), where n is the number of OIDs in
schemarelids in the test case. A simple change would be to do sort &
dedup instead. With the attached experimental patch, the
pg_get_publication_tables() execution time gets halved in my
environment (796ms -> 430ms with 50k tables). If the number of tables
is not large, this method might be slower than the current code, but
it's not a huge regression.
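Just to sketch the idea (a rough illustration of the sort & dedup
approach using the existing list helpers; not necessarily identical to
the attached patch):

    /*
     * Concatenate both lists, sort by OID, then drop adjacent
     * duplicates.  This is O(n log n) rather than the O(n^2) walk
     * that list_concat_unique_oid() does for each appended element.
     */
    pub_elem_tables = list_concat(relids, schemarelids);
    list_sort(pub_elem_tables, list_oid_cmp);
    list_deduplicate_oid(pub_elem_tables);

list_deduplicate_oid() requires a sorted input list, which the
list_sort() call guarantees here.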
For the initial tablesync case, it could be optimized further by
introducing a new SQL function that returns the column list and row
filter expression of a specific table. That way we could filter by
relid at an early stage instead of gathering all the information and
then filtering by relid, as the tablesync worker does today, avoiding
the overhead of accumulating system catalog scan results for every
published table.
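For illustration only (the function name, signature, and output
columns below are hypothetical, not something that exists today), the
tablesync worker could then run something like:

    SELECT gpt.attrs, gpt.qual
      FROM pg_get_publication_table_info('mypub', 274376788) gpt;

so the relid filter is applied while scanning the catalogs rather
than after building entries for every published table.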
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
| Attachment | Content-Type | Size |
|---|---|---|
| sort_and_dedup.patch | application/octet-stream | 637 bytes |