| From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
|---|---|
| To: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
| Cc: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Initial COPY of Logical Replication is too slow |
| Date: | 2026-03-31 19:29:37 |
| Message-ID: | CAD21AoC1wHPESMoNvef+3_+G8assoi966Adm0AiYcz+YW9RAmQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Tue, Mar 31, 2026 at 4:39 AM Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> Dear Sawada-san,
>
> Thanks for updating the patch! Few comments.
Thank you for the comments!
>
> 01.
> ```
> +/*
> + * Similar to is_publishable_calss() but checks whether the given OID
> + * is a publishable "table" or not.
> + */
> +static bool
> +is_publishable_table(Oid tableoid)
> ```
>
> s/is_publishable_calss/is_publishable_class/.
>
> 02.
> ```
> + ReleaseSysCache(tuple);
> + return true;
> ```
>
> Is it correct? I expected to return false here.
>
> 03.
> ```
> + /*
> + * Preliminary check if the specified table can be published in the
> + * first place. If not, we can return early without checking the given
> + * publications and the table.
> + */
> + if (filter_by_relid && !is_publishable_table(target_relid))
> + SRF_RETURN_DONE(funcctx);
> ```
>
> I think we must switch to the old context.
>
> 04.
> ```
> +CREATE PUBLICATION pub_all_except FOR ALL TABLES EXCEPT TABLE (tbl_parent, gpt_test_sch.tbl_sch) WITH (publish_via_partition_root = false);
> +CREATE PUBLICATION pub_all_except_no_viaroot FOR ALL TABLES EXCEPT TABLE (tbl_parent, gpt_test_sch.tbl_sch) WITH (publish_via_partition_root = true);
> ```
>
> It needs to be rebased due to 5984ea86.
Agreed with the all above points. I'll fix them in the next version patch.
>
> 05.
> ```
> CREATE VIEW pg_publication_tables AS
> SELECT
> P.pubname AS pubname,
> N.nspname AS schemaname,
> C.relname AS tablename,
> ( SELECT array_agg(a.attname ORDER BY a.attnum)
> FROM pg_attribute a
> WHERE a.attrelid = GPT.relid AND
> a.attnum = ANY(GPT.attrs)
> ) AS attnames,
> pg_get_expr(GPT.qual, GPT.relid) AS rowfilter
> FROM pg_publication P,
> LATERAL pg_get_publication_tables(P.pubname) GPT,
> pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)
> WHERE C.oid = GPT.relid;
> ```
>
> Can we use the new API of pg_get_publication_tables() here? Below change can pass
> tests on my env.
>
> ```
> - LATERAL pg_get_publication_tables(P.pubname) GPT,
> - pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)
> - WHERE C.oid = GPT.relid;
> + pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace),
> + LATERAL pg_get_publication_tables(ARRAY[P.pubname], C.oid) GPT;
> ```
I'm not sure the new API of pg_get_publication_tables() is better here
since this view is going to get the publication information of all
published tables, in which case the existing one might be faster.
Also, if a few tables among a huge number of tables (or whatever
relations) are published, checking all relations with the new API of
pg_get_publication_tables() would be quite slow.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Melanie Plageman | 2026-03-31 19:31:39 | Re: AIO / read stream heuristics adjustments for index prefetching |
| Previous Message | Melanie Plageman | 2026-03-31 19:07:32 | Re: Don't synchronously wait for already-in-progress IO in read stream |