Re: 10.0: Logical replication doesn't execute BEFORE UPDATE OF <columns> trigger

From: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Aleksander Alekseev <a(dot)alekseev(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: 10.0: Logical replication doesn't execute BEFORE UPDATE OF <columns> trigger
Date: 2017-10-10 13:25:14
Message-ID: da37a748-0d72-617c-92bd-aa4299dab096@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On 10/10/17 09:53, Masahiko Sawada wrote:
> On Tue, Oct 10, 2017 at 11:29 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> On Mon, Oct 9, 2017 at 11:13 PM, Aleksander Alekseev
>> <a(dot)alekseev(at)postgrespro(dot)ru> wrote:
>>> Hi hackers,
>>>
>>> I've found something that looks like a bug.
>>>
>>> Steps to reproduce
>>> ------------------
>>>
>>> There are 3 instances of PostgreSQL 10.0 - inst1, inst2 and inst3. There
>>> is a table `test` on every instance:
>>>
>>> ```
>>> CREATE TABLE test(k TEXT PRIMARY KEY, v TEXT);
>>> ```
>>>
>>> Both inst1 and inst2 have `allpub` publication:
>>>
>>> ```
>>> CREATE PUBLICATION allpub FOR ALL TABLES;
>>> ```
>>>
>>> ... and inst3 is subscribed for both publications:
>>>
>>> ```
>>> CREATE SUBSCRIPTION allsub1
>>> CONNECTION 'host=10.128.0.16 user=eax dbname=eax'
>>> PUBLICATION allpub;
>>>
>>> CREATE SUBSCRIPTION allsub2
>>> CONNECTION 'host=10.128.0.26 user=eax dbname=eax'
>>> PUBLICATION allpub;
>>> ```
>>>
>>> So basically it's two masters, one replica configuration. To resolve
>>> insert/update conflicts I've created the following triggers on inst3:
>>>
>>> ```
>>> CREATE OR REPLACE FUNCTION test_before_insert()
>>> RETURNS trigger AS $$
>>> BEGIN
>>>
>>> RAISE NOTICE 'test_before_insert trigger executed';
>>>
>>> IF EXISTS (SELECT 1 FROM test where k = new.k) THEN
>>> RAISE NOTICE 'test_before_insert trigger - merging data';
>>> UPDATE test SET v = v || ';' || new.v WHERE k = new.k;
>>> RETURN NULL;
>>> END IF;
>>>
>>> RETURN NEW;
>>>
>>> END
>>> $$ LANGUAGE plpgsql;
>>>
>>>
>>> CREATE OR REPLACE FUNCTION test_before_update()
>>> RETURNS trigger AS $$
>>> BEGIN
>>>
>>> RAISE NOTICE 'test_before_update trigger executed';
>>>
>>> IF EXISTS (SELECT 1 FROM test where k = new.k) THEN
>>> RAISE NOTICE 'test_before_update trigger - merging data';
>>> UPDATE test SET v = v || ';' || new.v WHERE k = new.k;
>>> DELETE FROM test where k = old.k;
>>> RETURN NULL;
>>> END IF;
>>>
>>> RETURN NEW;
>>>
>>> END
>>> $$ LANGUAGE plpgsql;
>>>
>>> create trigger test_before_insert_trigger
>>> before insert on test
>>> for each row execute procedure test_before_insert();
>>>
>>> create trigger test_before_update_trigger
>>> before update of k on test
>>> for each row execute procedure test_before_update();
>>>
>>> ALTER TABLE test ENABLE REPLICA TRIGGER test_before_insert_trigger;
>>> ALTER TABLE test ENABLE REPLICA TRIGGER test_before_update_trigger;
>>> ```
>>>
>>> The INSERT trigger works just as expected, however the UPDATE trigger
>>> doesn't. On inst1:
>>>
>>> ```
>>> insert into test values ('k1', 'v1');
>>> ```
>>>
>>> In inst2:
>>>
>>> ```
>>> insert into test values ('k4', 'v4');
>>> update test set k = 'k1' where k = 'k4';
>>> ```
>>>
>>> Now on inst3:
>>>
>>> ```
>>> select * from test;
>>> ```
>>>
>>> Expected result
>>> ---------------
>>>
>>> Rows are merged to:
>>>
>>> ```
>>> k | v
>>> ----+-------
>>> k1 | v1;v4
>>> ```
>>>
>>> This is what would happen if all insert/update queries would have been
>>> executed on one instance.
>>>
>>> Actual result
>>> -------------
>>>
>>> Replication fails, log contains:
>>>
>>> ```
>>> [3227] ERROR: duplicate key value violates unique constraint "test_pkey"
>>> [3227] DETAIL: Key (k)=(k1) already exists.
>>> [3176] LOG: worker process: logical replication worker for subscription 16402 (PID 3227) exited with exit code 1
>>> ```
>>>
>>> What do you think?
>>>
>>
>> I think the cause of this issue is that the apply worker doesn't set
>> updatedCols of RangeTblEntry when applying updates. So TriggerEnabled
>> always ends up with false. I'll make a patch and submit.
>>
>
> Attached patch store the updated columns bitmap set to RangeTblEntry.
> In my environment this bug seems to be fixed by the patch.
>
Hi,

let me start by saying that my view is that this is simply a
documentation bug. Meaning that I didn't document that it does not work,
but I also never intended it to work. Main reason is that we can't
support the semantics of "UPDATE OF" correctly (see bellow). And I think
it's better to not support something at all rather than making it behave
differently in different cases.

Now about the proposed patch, I don't think this is correct way to
support this as it will only work when either PRIMARY KEY column was
changed or when REPLICA IDENTITY is set to FULL for the table. And even
then it will have very different semantics from how it works when the
row is updated by SQL statement (all non-toasted columns will be
reported as changed regardless of actually being changed or not).

The more proper way to do this would be to run data comparison of the
new tuple and the existing tuple values which a) will have different
behavior than normal "UPDATE OF" triggers have because we still can't
tell what columns were mentioned in the original query and b) will not
exactly help performance (and perhaps c) one can write the trigger to
behave this way already without impacting all other use-cases).

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2017-10-10 13:58:58 Re: BUG #14830: Missed NOTIFications, PostgreSQL 9.1.24
Previous Message Marko Tiikkaja 2017-10-10 12:39:49 Re: BUG #14830: Missed NOTIFications, PostgreSQL 9.1.24

Browse pgsql-hackers by date

  From Date Subject
Next Message Gourav Kumar 2017-10-10 13:59:24 How does postgres store the join predicate for a relation in a given query
Previous Message Michael Paquier 2017-10-10 13:12:25 Re: [JDBC] Channel binding support for SCRAM-SHA-256