Timeout when changes are filtered out by the core during logical replication

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Subject: Timeout when changes are filtered out by the core during logical replication
Date: 2022-12-22 13:27:52
Message-ID: CAGEoWWRhD_iiVQ0RvK=neBOzanpsfz5DVvnYxfds9vrio4RxWg@mail.gmail.com
Lists: pgsql-hackers

Hi All,
A customer ran a script that dropped a few dozen users in a single
transaction. Before dropping each user, the script changed the ownership of
the tables owned by that user to another user and revoked all of that
user's privileges, all in the same transaction. The transaction changed the
privileges and ownership of a few thousand tables. Since all of these
changes were to catalog tables, they were filtered out in
ReorderBufferProcessTXN() by the following code:

    if (!RelationIsLogicallyLogged(relation))
        goto change_done;

I tried to reproduce a similar situation with the attached TAP test. With
500 users and 1000 tables, the transaction itself takes a significant
amount of time, but logical decoding does not. So with the default 1 minute
WAL sender and receiver timeouts I could not reproduce the timeout. Beyond
that scale, the TAP test itself times out.

But I think it is still possible for the logical receiver to time out this
way when decoding a sufficiently large transaction, one that takes longer
than the timeout to decode. So I think we need to call
OutputPluginUpdateProgress() at regular intervals (in terms of time or
number of changes) to consume any feedback from the subscriber or send a
keep-alive message.
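
A minimal standalone sketch of the "every N changes" variant of this idea
follows. It is not PostgreSQL code: the Change struct, process_changes(),
count_progress_call() and the PROGRESS_INTERVAL value are all hypothetical
names chosen for illustration, and the callback merely stands in for a call
such as OutputPluginUpdateProgress(). The point it shows is that the counter
advances even for changes that are filtered out, so progress is still
reported when nothing reaches the output plugin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a decoded change; not a PostgreSQL type. */
typedef struct Change
{
    bool logically_logged;  /* false for, e.g., catalog-only changes */
} Change;

/* Stand-in for the progress/keep-alive hook. */
typedef void (*ProgressCallback) (void *arg);

/* Assumed interval; the right value would need to be chosen by testing. */
#define PROGRESS_INTERVAL 100

/*
 * Walk over all changes of a transaction. Changes that are not logically
 * logged are skipped, but they still count towards the progress interval,
 * so the callback fires once every PROGRESS_INTERVAL changes even when
 * every change is filtered out.
 *
 * Returns the number of changes that would have been handed to the
 * output plugin.
 */
static size_t
process_changes(const Change *changes, size_t nchanges,
                ProgressCallback update_progress, void *arg)
{
    size_t sent = 0;
    size_t since_progress = 0;

    assert(update_progress != NULL);

    for (size_t i = 0; i < nchanges; i++)
    {
        if (changes[i].logically_logged)
            sent++;             /* would be passed to the output plugin */

        if (++since_progress >= PROGRESS_INTERVAL)
        {
            update_progress(arg);
            since_progress = 0;
        }
    }

    return sent;
}

/* Example callback: counts how many times progress was reported. */
static void
count_progress_call(void *arg)
{
    (*(int *) arg)++;
}
```

A time-based interval would work the same way, with the counter replaced by
a comparison against the time of the last progress update.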

The following commit
```
commit 87c1dd246af8ace926645900f02886905b889718
Author: Amit Kapila <akapila(at)postgresql(dot)org>
Date: Wed May 11 10:12:23 2022 +0530

Fix the logical replication timeout during large transactions.

```
fixed a similar problem for changes filtered out by an output plugin, but
in this case the changes are never even handed over to the output plugin.
If we fix this in the core, we may not need to handle it in the output
plugin the way that commit does. The commit does not include a test case
that I could run to reproduce the timeout.

--
Best Wishes,
Ashutosh

Attachment Content-Type Size
032_long_decoding.pl application/x-perl 5.3 KB
