Re: Support logical replication of DDLs

From: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Zheng Li <zhengli10(at)gmail(dot)com>, li jie <ggysxcq(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, rajesh singarapu <rajesh(dot)rs0541(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Support logical replication of DDLs
Date: 2023-02-16 16:48:22
Message-ID: 263bea1c-a897-417d-3765-ba6e1e24711e@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Hi,

On 2/14/23 10:01 PM, houzj(dot)fnst(at)fujitsu(dot)com wrote:

> Here is the new version patch which addressed above comments.
> I also fixed a bug for the deparsing of CREATE RULE that it didn't add
> parentheses for rule action list.

I started testing this change set from this patch. I'm doing a mix of
happy path, "making mistakes" path, and "real world" testing, and
testing this both with unidirectional and "origin=none" replication.

I wanted to report an issue I came up with using one of my real world
cases. I had previously built a demo scheduling app to demonstrate
several features of PostgreSQL to help with various kinds of data
synchronization[1]. The first example uses a series of functions and
triggers[2] to keep a calendar table up-to-date.

I set up an experiment as such:

1. Create two different clusters. In each cluster, create a DB
2. On Cluster 1, run:

CREATE PUBLICATION ddl FOR ALL TABLES WITH (ddl='all');

3. On Cluster 2, run:

CREATE SUBSCRIPTION ddl CONNECTION '' PUBLICATION ddl;

4. On Cluster 1, run the commands in [2]. Note that I reproduced the
error both by running the commands individually and as part of a single
transaction.

5. The transactions (or single transaction) completes successfully on
Cluster 1

5. Cluster 2 reports the following error:

2023-02-16 16:11:10.537 UTC [25207] LOG: logical replication apply
worker for subscription "ddl" has started
2023-02-16 16:11:10.570 UTC [25207] ERROR: relation "availability" does
not exist at character 279
2023-02-16 16:11:10.570 UTC [25207] CONTEXT: processing remote data for
replication origin "pg_16733" during message type "DDL" in transaction
890, finished at 0/BF298CC0
2023-02-16 16:11:10.570 UTC [25207] STATEMENT: CREATE OR REPLACE
FUNCTION public.availability_rule_bulk_insert ( IN availability_rule
public.availability_rule, IN day_of_week pg_catalog.int4 ) RETURNS
pg_catalog.void LANGUAGE sql VOLATILE PARALLEL UNSAFE CALLED ON NULL
INPUT SECURITY INVOKER COST 100 AS $_$
INSERT INTO availability (
room_id,
availability_rule_id,
available_date,
available_range
)
SELECT
$1.room_id,
$1.id,
available_date::date + $2 - 1,
tstzrange(
/** start of range */
(available_date::date + $2 - 1) + $1.start_time,
/** end of range */
/** check if there is a time wraparound, if so, increment
by a day */
CASE $1.end_time <= $1.start_time
WHEN TRUE THEN (available_date::date + $2) + $1.end_time
ELSE (available_date::date + $2 - 1) + $1.end_time
END
)
FROM
generate_series(
date_trunc('week', CURRENT_DATE),
date_trunc('week', CURRENT_DATE) +
($1.generate_weeks_into_future::text || ' weeks')::interval,
'1 week'::interval
) available_date;
$_$
2023-02-16 16:11:10.573 UTC [15348] LOG: background worker "logical
replication worker" (PID 25207) exited with exit code 1

I attempted this with both async and sync logical replication. In sync
mode, the publisher hangs and is unable to accept any more writes.

When I went in and explicitly schema qualified the tables in the
functions[3], the example executed successfully.

My high level guess without looking at the code is that the apply worker
is not aware of the search_path to use when processing functions during
creation. Provided that the publisher/subscriber environments are
similar (if not identical), I would expect that if the function create
succeeds on the publisher, it should also succeed on the subscriber.

Thanks,

Jonathan

[1] https://github.com/CrunchyData/postgres-realtime-demo
[2]
https://github.com/CrunchyData/postgres-realtime-demo/blob/main/examples/demo/demo1.sql
[3] https://gist.github.com/jkatz/5655c10da1a4c8691094e951ea07b036

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message David G. Johnston 2023-02-16 16:55:28 Re: DELETE trigger, direct or indirect?
Previous Message Adrian Klaver 2023-02-16 16:46:27 Re: DELETE trigger, direct or indirect?

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2023-02-16 16:57:59 Re: Weird failure with latches in curculio on v15
Previous Message Andres Freund 2023-02-16 16:44:08 Re: [PoC] Improve dead tuple storage for lazy vacuum