Re: Support logical replication of DDLs

From: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Zheng Li <zhengli10(at)gmail(dot)com>, li jie <ggysxcq(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, rajesh singarapu <rajesh(dot)rs0541(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Support logical replication of DDLs
Date: 2023-02-19 02:20:31
Message-ID: 831d8428-7a62-3bc6-4e3b-93458469f6a7@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On 2/17/23 4:15 AM, Amit Kapila wrote:
> On Fri, Feb 17, 2023 at 1:13 AM Jonathan S. Katz <jkatz(at)postgresql(dot)org> wrote:
>>
>> On 2/16/23 2:38 PM, Alvaro Herrera wrote:
>>> On 2023-Feb-16, Jonathan S. Katz wrote:
>>>
>>>> On 2/16/23 12:53 PM, Alvaro Herrera wrote:
>>>
>>>>> I don't think this is the fault of logical replication. Consider that
>>>>> for the backend server, the function source code is just an opaque
>>>>> string that is given to the plpgsql engine to interpret. So there's no
>>>>> way for the logical DDL replication engine to turn this into runnable
>>>>> code if the table name is not qualified.
>>>>
>>>> Sure, that's fair. That said, the example above would fall under a "typical
>>>> use case", i.e. I'm replicating functions that call tables without schema
>>>> qualification. This is pretty common, and as logical replication becomes
>>>> used for more types of workloads (e.g. high availability), we'll definitely
>>>> see this.
>>>
>>> Hmm, I think you're saying that replay should turn check_function_bodies
>>> off, and I think I agree with that.
>>
>> Yes, exactly. +1
>>
>
> But will that be sufficient? I guess such functions can give errors at
> a later stage when invoked at DML or another DDL time. Consider the
> following example:
>
> Pub:
> CREATE PUBLICATION pub FOR ALL TABLES with (ddl = 'all');
>
> Sub:
> (Set check_function_bodies = off in postgresql.conf)
> CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres' PUBLICATION pub;
>
> Pub:
> CREATE FUNCTION t1(a int) RETURNS int AS $$
> select a+1;
> $$ LANGUAGE sql;
> CREATE FUNCTION t(a int) RETURNS int AS $$
> select t1(a);
> $$ LANGUAGE sql;
> CREATE TABLE tbl1 (a int primary key, b text);
> create index idx on tbl1(t(a));
>
> insert into tbl1 values (1,1); -- This insert on publisher causes an
> error on the subscriber. Check subscriber Logs (ERROR: function
> t1(integer) does not exist at character 9.)

I did reproduce this as is. I also reproduced this when I rewrote the
function in PL/pgSQL.

I also did an experiment using PL/v8[1] where I rewrote the functions
above and did two tests: one via SPI, the other via PL/v8's ability to
find and call a function[2]. In both cases, the INSERT statement failed
citing the inability to find the function. The calls did work when I
schema-qualified.

However, when I converted the SQL-only functions to use the SQL standard
syntax (BEGIN ATOMIC), I did not get this error and was able to
successfully use this index with the table on both publisher and
subscriber. I believe this is due to the generated function body having
all of the schema qualifications in it.

> This happens because of the function used in the index expression.
> Now, this is not the only thing, the replication can even fail during
> DDL replication when the function like above is IMMUTABLE and used as
> follows: ALTER TABLE tbl ADD COLUMN d int DEFAULT t(1);
>
> Normally, it is recommended that users can fix such errors by
> schema-qualifying affected names. See commits 11da97024a and
> 582edc369c.

I'm very familiar with those CVEs, but even though these are our
recommended best practices, there is still a lot of code that does not
schema-qualify the names of functions (including many of our own examples ;)

If we're going to say "You can use logical replication to replicate
functions, but you have to ensure you've schema-qualified any function
calls within them," I think that will prevent people from being able to
use this feature, particularly on existing applications.

I guess there's a connection I'm missing here. For the failing examples
above, I look at the pg_proc entries on both the publisher and the
subscriber and they're identical. I'm not understanding why creating and
executing the functions works on the publisher, but it does not on the
subscriber. What additional info would the subscriber need to be able to
successfully run these functions? Would we need to pass in some
additional context, e.g. what the search_path was at the time the
publisher created the function?

Thanks,

Jonathan

[1] https://plv8.github.io/
[2] https://plv8.github.io/#-code-plv8-find_function-code-

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message David G. Johnston 2023-02-19 02:50:47 Re: Who adds the "start transaction" and "commit" to the intended SQL statement in "autocommit" mode?
Previous Message Christophe Pettus 2023-02-19 00:51:33 Re: Who adds the "start transaction" and "commit" to the intended SQL statement in "autocommit" mode?

Browse pgsql-hackers by date

  From Date Subject
Next Message shiy.fnst@fujitsu.com 2023-02-19 02:40:31 "out of relcache_callback_list slots" after multiple calls to pg_logical_slot_get_binary_changes
Previous Message Matthias van de Meent 2023-02-19 01:03:21 Re: Ignoring BRIN for HOT updates (was: -udpates seems broken)