Re: Initial COPY of Logical Replication is too slow

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Initial COPY of Logical Replication is too slow
Date: 2026-03-31 17:28:11
Message-ID: CAD21AoCLvb37vW0pE2eG0=aD0ifTTX-ruBJB8QHYbcMynUQPdQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 31, 2026 at 2:36 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Mar 25, 2026 at 2:19 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > Hi Swada-San. Here are some minor review comments for v4-0001/2 combined.
> >
> > ======
> > src/backend/catalog/pg_publication.c
> >
> > is_table_publishable_in_publication:
> >
> > 1.
> > This function logic has a format like
> >
> > if (cond)
> > {
> > ...
> > return;
> > }
> >
> > if (cond2)
> > {
> > ...
> > return;
> > }
> >
> > etc.
> >
> > There are many return points, and most of those "if" blocks cannot
> > fall through (they return).
> >
> > I found it slightly difficult to read the code because I kept having
> > to think, "OK, if we reached here, it means pubviaroot must be false,"
> > or "OK, if we reached this far, then puballtables must be false, and
> > pubviaroot must be false," etc.
> >
>
> I can't say exactly why, but I find it difficult to read this
> function. So, I share your concerns about the code of this function.
> Because of its complexity it is difficult to ascertain that the
> functionality is correct or we missed something. Also, considering it
> is correct today, in its current form, it may become difficult to
> enhance it in future.

Okay, I'll refactor that function.

>
> One more comment on latest patch:
> *
> +static Datum
> +pg_get_publication_tables(FunctionCallInfo fcinfo, ArrayType *pubnames,
> + Oid target_relid, bool filter_by_relid,
>
> Why do we need filter_by_relid as a separate parameter? Isn't the
> valid value of target_relid the same? If so, can't we use target_relid
> for the required checks?

If we don't have filter_by_relid, we would end up not filtering
anything if users pass 0 (InvalidOid) as the target_relid to the new
pg_get_publication_tables(). This is the same as the behavior of the
existing pg_get_publication_tables(), so I'm concerned that it could
be confusing that the function behaves the same even though passing
different arguments . We can check whether the given target_relid is
valid in pg_get_publication_b() but we would end up checking it
multiple times unnecessarily.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2026-03-31 17:33:20 Re: Exit walsender before confirming remote flush in logical replication
Previous Message Joao Foltran 2026-03-31 17:25:51 Re: [BUG] [PATCH] Allow physical replication slots to recover from archive after invalidation