Re: Support logical replication of DDLs

From: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Zheng Li <zhengli10(at)gmail(dot)com>, li jie <ggysxcq(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, rajesh singarapu <rajesh(dot)rs0541(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Support logical replication of DDLs
Date: 2023-02-16 21:08:14
Message-ID: e947fa21-24b2-f922-375a-d4f763ef3e4b@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On 2/16/23 2:43 PM, Jonathan S. Katz wrote:
> On 2/16/23 2:38 PM, Alvaro Herrera wrote:
>> On 2023-Feb-16, Jonathan S. Katz wrote:
>>
>>> On 2/16/23 12:53 PM, Alvaro Herrera wrote:
>>
>>>> I don't think this is the fault of logical replication.  Consider that
>>>> for the backend server, the function source code is just an opaque
>>>> string that is given to the plpgsql engine to interpret.  So there's no
>>>> way for the logical DDL replication engine to turn this into runnable
>>>> code if the table name is not qualified.
>>>
>>> Sure, that's fair. That said, the example above would fall under a
>>> "typical
>>> use case", i.e. I'm replicating functions that call tables without
>>> schema
>>> qualification. This is pretty common, and as logical replication becomes
>>> used for more types of workloads (e.g. high availability), we'll
>>> definitely
>>> see this.
>>
>> Hmm, I think you're saying that replay should turn check_function_bodies
>> off, and I think I agree with that.
>
> Yes, exactly. +1

I drilled into this a bit more using the SQL standard bodies (BEGIN
ATOMIC) to see if there were any other behaviors we needed to account
for. Overall, it worked well but I ran into one issue.

First, functions with "BEGIN ATOMIC" ignores "check_function_bodies"
which is by design based on how this feature works. We should still turn
"check_function_bodies" to "off" though, per above discussion.

In the context of DDL replication, "BEGIN ATOMIC" does support
schema-unqualified functions, presumably because it includes the parsed
content?

I created an updated example[1] where I converted the SQL functions to
use the standard syntax and I returned the table names to be schema
unqualified. This seemed to work, but I ran into a weird case with this
function:

CREATE OR REPLACE FUNCTION public.calendar_manage(room_id int,
calendar_date date)
RETURNS void
LANGUAGE SQL
BEGIN ATOMIC
WITH delete_calendar AS (
DELETE FROM calendar
WHERE
room_id = $1 AND
calendar_date = $2
)
INSERT INTO calendar (room_id, status, calendar_date, calendar_range)
SELECT $1, c.status, $2, c.calendar_range
FROM calendar_generate_calendar($1, tstzrange($2, $2 + 1)) c;
END;

This produced an error on the subscriber, with the following message:

2023-02-16 20:58:24.096 UTC [26864] ERROR: missing FROM-clause entry
for table "calendar_1" at character 322
2023-02-16 20:58:24.096 UTC [26864] CONTEXT: processing remote data for
replication origin "pg_18658" during message type "DDL" in transaction
980, finished at 0/C099A7D8
2023-02-16 20:58:24.096 UTC [26864] STATEMENT: CREATE OR REPLACE
FUNCTION public.calendar_manage ( IN room_id pg_catalog.int4, IN
calendar_date pg_catalog.date ) RETURNS pg_catalog.void LANGUAGE sql
VOLATILE PARALLEL UNSAFE CALLED ON NULL INPUT SECURITY INVOKER COST 100
BEGIN ATOMIC
WITH delete_calendar AS (
DELETE FROM public.calendar
WHERE ((calendar_1.room_id OPERATOR(pg_catalog.=)
calendar_manage.room_id) AND (calendar_1.calendar_date
OPERATOR(pg_catalog.=) calendar_manage.calendar_date))
)
INSERT INTO public.calendar (room_id, status, calendar_date,
calendar_range) SELECT calendar_manage.room_id,
c.status,
calendar_manage.calendar_date,
c.calendar_range
FROM
public.calendar_generate_calendar(calendar_manage.room_id,
pg_catalog.tstzrange((calendar_manage.calendar_date)::timestamp with
time zone, ((calendar_manage.calendar_date OPERATOR(pg_catalog.+)
1))::timestamp with time zone)) c(status, calendar_range);
END

This seemed to add an additional, incorrect reference to the origin
table for the "room_id" and "calendar_date" attributes within the CTE of
this function. I don't know if this is directly related to the DDL
replication patch, but reporting it as I triggered the behavior through it.

Thanks,

Jonathan

[1] https://gist.github.com/jkatz/fe29006b724fd6f32ee849a96dc01608

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Ken Tanzer 2023-02-16 22:47:20 Re: How to avoid Trigger ping/pong / infinite loop
Previous Message Adrian Klaver 2023-02-16 19:55:03 Re: How to avoid Trigger ping/pong / infinite loop

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-02-16 21:17:54 Re: recovery modules
Previous Message Nathan Bossart 2023-02-16 20:15:12 Re: recovery modules