| From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
|---|---|
| To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
| Cc: | Jan Wieck <jan(at)wi3ck(dot)info>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Initial COPY of Logical Replication is too slow |
| Date: | 2026-03-25 05:06:54 |
| Message-ID: | CAD21AoBJcxRcaWQot302diaxoDcsnezRhnZa7p8UrPh5AGNeHQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Tue, Mar 24, 2026 at 11:57 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Tue, Mar 24, 2026 at 3:47 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Thu, Mar 19, 2026 at 4:59 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Mar 18, 2026 at 3:31 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > >
> > >
> > > I've attached the patch to implement this idea. The patch still
> > > introduces a new function but it overloads
> > > pg_get_publication_tables(). We might be able to handle different
> > > input (array or text) in pg_get_publication_tables() better, but it's
> > > enough for discussion at least.
> > >
> >
> > *
> > + /*
> > + * We can pass relid to pg_get_publication_table_info() since
> > + * version 19.
> > + */
> > + appendStringInfo(&cmd,
> > + "SELECT DISTINCT"
> > + " (CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts)"
> > + " THEN NULL ELSE gpt.attrs END)"
> > + " FROM pg_publication p,"
> > + " LATERAL pg_get_publication_tables(p.pubname, %u) gpt,"
> > + " pg_class c"
> > + " WHERE c.oid = gpt.relid"
> > + " AND p.pubname IN ( %s )",
> > + lrel->remoteid,
> > + pub_names->data);
> >
> > Why in the above query we need a join with pg_publication? Can't we
> > directly pass 'pub_names' and 'relid' to pg_get_publication_tables()
> > to get the required information?
>
> Since the 'pub_names' is the list of publication names we cannot
> directly pass it to the pg_get_publication_tables(). But if we make
> pg_get_publication_tables() take {pubname text[], target_relid oid}
> instead of {pubname text, target_relid oid}, yes. And it seems to help
> somewhat simplify the patch. If having both
> pg_get_publication_tables(VARIADIC text[]) and
> pg_get_publication_tables(text[], oid) is not odd, it would be worth
> trying it.
>
I figured out that the join with pg_publication works as a filter;
non-existence publication names are not passed to the function. If we
pass the list of publication names to the new function signature,
while we can simplify the patch and avoid a join, we would change the
existing function behavior so that it ignores non-existence
publications.
I've attached the updated patch. The 0001 patch just incorporated the
review comments so far, and the 0002 patch is a draft change for the
above idea. Since pg_get_publication_tables(VARIADIC text) is not a
documented function, I think we can accept small behavior changes. So
I'm going to go with this direction. Feedback is very welcome.
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
| Attachment | Content-Type | Size |
|---|---|---|
| v4-0002-POC-pass-the-list-of-publications-to-pg_get_publi.patch | text/x-patch | 21.1 KB |
| v4-0001-Avoid-full-table-scans-when-getting-publication-t.patch | text/x-patch | 26.0 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Michael Paquier | 2026-03-25 05:11:00 | Re: [PATCH v2] Re: Cancel problems of query to pg_stat_statements |
| Previous Message | Michael Paquier | 2026-03-25 05:02:27 | Re: access numeric data in module |