Re: Support logical replication of DDLs

From: vignesh C <vignesh21(at)gmail(dot)com>
To: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Zheng Li <zhengli10(at)gmail(dot)com>, li jie <ggysxcq(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, rajesh singarapu <rajesh(dot)rs0541(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Support logical replication of DDLs
Date: 2023-02-17 11:04:22
Message-ID: CALDaNm1CMnLfZRSNUxTZrAZpjKYhx4Pz9vZ-296eLGp8wi8Bzw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Fri, 17 Feb 2023 at 02:38, Jonathan S. Katz <jkatz(at)postgresql(dot)org> wrote:
>
> On 2/16/23 2:43 PM, Jonathan S. Katz wrote:
> > On 2/16/23 2:38 PM, Alvaro Herrera wrote:
> >> On 2023-Feb-16, Jonathan S. Katz wrote:
> >>
> >>> On 2/16/23 12:53 PM, Alvaro Herrera wrote:
> >>
> >>>> I don't think this is the fault of logical replication. Consider that
> >>>> for the backend server, the function source code is just an opaque
> >>>> string that is given to the plpgsql engine to interpret. So there's no
> >>>> way for the logical DDL replication engine to turn this into runnable
> >>>> code if the table name is not qualified.
> >>>
> >>> Sure, that's fair. That said, the example above would fall under a
> >>> "typical
> >>> use case", i.e. I'm replicating functions that call tables without
> >>> schema
> >>> qualification. This is pretty common, and as logical replication becomes
> >>> used for more types of workloads (e.g. high availability), we'll
> >>> definitely
> >>> see this.
> >>
> >> Hmm, I think you're saying that replay should turn check_function_bodies
> >> off, and I think I agree with that.
> >
> > Yes, exactly. +1
>
> I drilled into this a bit more using the SQL standard bodies (BEGIN
> ATOMIC) to see if there were any other behaviors we needed to account
> for. Overall, it worked well but I ran into one issue.
>
> First, functions with "BEGIN ATOMIC" ignores "check_function_bodies"
> which is by design based on how this feature works. We should still turn
> "check_function_bodies" to "off" though, per above discussion.
>
> In the context of DDL replication, "BEGIN ATOMIC" does support
> schema-unqualified functions, presumably because it includes the parsed
> content?
>
> I created an updated example[1] where I converted the SQL functions to
> use the standard syntax and I returned the table names to be schema
> unqualified. This seemed to work, but I ran into a weird case with this
> function:
>
> CREATE OR REPLACE FUNCTION public.calendar_manage(room_id int,
> calendar_date date)
> RETURNS void
> LANGUAGE SQL
> BEGIN ATOMIC
> WITH delete_calendar AS (
> DELETE FROM calendar
> WHERE
> room_id = $1 AND
> calendar_date = $2
> )
> INSERT INTO calendar (room_id, status, calendar_date, calendar_range)
> SELECT $1, c.status, $2, c.calendar_range
> FROM calendar_generate_calendar($1, tstzrange($2, $2 + 1)) c;
> END;
>
> This produced an error on the subscriber, with the following message:
>
> 2023-02-16 20:58:24.096 UTC [26864] ERROR: missing FROM-clause entry
> for table "calendar_1" at character 322
> 2023-02-16 20:58:24.096 UTC [26864] CONTEXT: processing remote data for
> replication origin "pg_18658" during message type "DDL" in transaction
> 980, finished at 0/C099A7D8
> 2023-02-16 20:58:24.096 UTC [26864] STATEMENT: CREATE OR REPLACE
> FUNCTION public.calendar_manage ( IN room_id pg_catalog.int4, IN
> calendar_date pg_catalog.date ) RETURNS pg_catalog.void LANGUAGE sql
> VOLATILE PARALLEL UNSAFE CALLED ON NULL INPUT SECURITY INVOKER COST 100
> BEGIN ATOMIC
> WITH delete_calendar AS (
> DELETE FROM public.calendar
> WHERE ((calendar_1.room_id OPERATOR(pg_catalog.=)
> calendar_manage.room_id) AND (calendar_1.calendar_date
> OPERATOR(pg_catalog.=) calendar_manage.calendar_date))
> )
> INSERT INTO public.calendar (room_id, status, calendar_date,
> calendar_range) SELECT calendar_manage.room_id,
> c.status,
> calendar_manage.calendar_date,
> c.calendar_range
> FROM
> public.calendar_generate_calendar(calendar_manage.room_id,
> pg_catalog.tstzrange((calendar_manage.calendar_date)::timestamp with
> time zone, ((calendar_manage.calendar_date OPERATOR(pg_catalog.+)
> 1))::timestamp with time zone)) c(status, calendar_range);
> END
>
> This seemed to add an additional, incorrect reference to the origin
> table for the "room_id" and "calendar_date" attributes within the CTE of
> this function. I don't know if this is directly related to the DDL
> replication patch, but reporting it as I triggered the behavior through it.

I had analyzed this issue and found that this issue exists with
getting query definition, I could reproduce the issue with pg_dump and
pg_get_function_sqlbody. I have started a new thread for this at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm1MMntjmT_NJGp-Z%3DxbF02qHGAyuSHfYHias3TqQbPF2w%40mail.gmail.com

Regards,
Vignesh

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Marc Millas 2023-02-17 13:23:07 Re: Postgres undeterministically uses a bad plan, how to convince it otherwise?
Previous Message cen 2023-02-17 10:30:03 Re: Postgres undeterministically uses a bad plan, how to convince it otherwise?

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2023-02-17 12:02:46 Re: ExecRTCheckPerms() and many prunable partitions (checkAsUser)
Previous Message Pavel Luzanov 2023-02-17 11:02:01 Re: psql: Add role's membership options to the \du+ command