Re: Initial COPY of Logical Replication is too slow

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Initial COPY of Logical Replication is too slow
Date: 2026-02-26 08:22:22
Message-ID: DDAEF936-A1F2-4619-B272-6290219B6BEC@gmail.com
Lists: pgsql-hackers

> On Feb 26, 2026, at 03:03, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Mon, Jan 26, 2026 at 12:30 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>
>> On Mon, Jan 19, 2026 at 9:44 AM Marcos Pegoraro <marcos(at)f10(dot)com(dot)br> wrote:
>>>
>>> Em sex., 19 de dez. de 2025 às 22:59, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> escreveu:
>>>>
>>>> Yeah, if we pass a publication that a lot of tables belong to to
>>>> pg_get_publication_tables(), it could take a long time to return as it
>>>> needs to construct many entries.
>>>
>>>
>>> Well, I don't know how to help but I'm sure it's working badly.
>>> Today I added some fields on my server, then seeing logs I could see how slow this process is.
>>>
>>> duration: 2213.872 ms statement: SELECT DISTINCT (CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts) THEN NULL ELSE gpt.attrs END) FROM pg_publication p, LATERAL pg_get_publication_tables(p.pubname) gpt, pg_class c WHERE gpt.relid = 274376788 AND c.oid = gpt.relid AND p.pubname IN ( 'mypub' )
>>>
>>> 2 seconds to get the list of fields of a table is really too slow.
>>> How can we solve this ?
>>
>> After more investigation of slowness, it seems that the
>> list_concat_unique_oid() called below is quite slow when the database
>> has a lot of tables to publish:
>>
>> relids = GetPublicationRelations(pub_elem->oid,
>>                                  pub_elem->pubviaroot ?
>>                                  PUBLICATION_PART_ROOT :
>>                                  PUBLICATION_PART_LEAF);
>> schemarelids = GetAllSchemaPublicationRelations(pub_elem->oid,
>>                                                 pub_elem->pubviaroot ?
>>                                                 PUBLICATION_PART_ROOT :
>>                                                 PUBLICATION_PART_LEAF);
>> pub_elem_tables = list_concat_unique_oid(relids, schemarelids);
>>
>> This is simply because it's O(n^2), where n is the number of oids in
>> schemarelids in the test case. A simple change would be to do sort &
>> dedup instead. With the attached experimental patch, the
>> pg_get_publication_tables() execution time gets halved in my
>> environment (796ms -> 430ms with 50k tables). If the number of tables
>> is not large, this method might be slower than today but it's not a
>> huge regression.
>>
>> In the initial tablesync cases, it could be optimized further in a way
>> that we introduce a new SQL function that gets the column list and
>> expr of the specific table. This way, we can filter the result by
>> relid at an early stage instead of getting all information and
>> filtering by relid as the tablesync worker does today, avoiding
>> overheads of gathering system catalog scan results.
>
> I've drafted this idea and I find it looks like a better approach. The
> patch introduces the pg_get_publication_table_info() SQL function that
> returns the column list and row filter expression like
> pg_get_publication_tables() returns but it checks only the specific
> table unlike pg_get_publication_tables(). On my env, the tablesync
> worker's query in question becomes 0.6ms from 288 ms with 50k tables
> in one publication. Feedback is very welcome.
>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com
> <0001-Add-pg_get_publication_table_info-to-optimize-logica.patch>

A few comments:

1. pg_publication.c needs a pgindent run. When I ran pgindent against it, the patched code got a lot of formatting changes.

2.
```
+ values[0] = ObjectIdGetDatum(pub->oid);
+ values[1] = ObjectIdGetDatum(relid);
+
+ values[0] = ObjectIdGetDatum(pub->oid);
+ values[1] = ObjectIdGetDatum(relid);
```

Looks like a copy-pasto; the second pair of assignments is redundant.

3. I think we can optimize pg_get_publication_table_info() a little bit for non-published tables by setting funcctx->max_calls = 0. I tried this code locally, and the tests still passed:
```
```
        if (publish)
        {
            pubrel = palloc_object(published_rel);
            pubrel->relid = relid;
            pubrel->pubid = pub->oid;
            funcctx->tuple_desc = BlessTupleDesc(tupdesc);
            funcctx->user_fctx = pubrel;
            funcctx->max_calls = 1;     /* return one row */
        }
        else
            funcctx->max_calls = 0;     /* return no rows */

        MemoryContextSwitchTo(oldcontext);
    }

    /* stuff done on every call of the function */
    funcctx = SRF_PERCALL_SETUP();

    if (funcctx->call_cntr < funcctx->max_calls)
    {
        HeapTuple   rettuple;

        table_info = (published_rel *) funcctx->user_fctx;
        rettuple = construct_published_rel_tuple(table_info, funcctx->tuple_desc);

        SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(rettuple));
    }

    SRF_RETURN_DONE(funcctx);
```

4.
```
+    int     i;
+
+    attnums = palloc_array(int16, desc->natts);
+
+    for (i = 0; i < desc->natts; i++)
```

Nit: Peter (E) did some cleanup a while back changing loop variables declared outside the loop into C99-style declarations inside the for statement, so I think that style is preferred now:
```
for (int i = 0; i < desc->natts; i++)
```

5.
```
+ attnums = palloc_array(int16, desc->natts);
```

Nit: the attnums array is never freed; is that intentional? Actually, I'm kinda lost about when memory should be freed explicitly and when it can be left to the memory context.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
