Re: logical replication restrictions

From: "Euler Taveira" <euler(at)eulerto(dot)com>
To: "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>, "Marcos Pegoraro" <marcos(at)f10(dot)com(dot)br>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: logical replication restrictions
Date: 2022-03-21 00:40:40
Message-ID: d192bc58-e9c7-442c-bae8-63f8e8f43fda@www.fastmail.com
Lists: pgsql-hackers

On Mon, Feb 28, 2022, at 9:18 PM, Euler Taveira wrote:
> Long time, no patch. Here it is. I will provide documentation in the next
> version. I would appreciate some feedback.
This patch has been broken since commit 705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33,
so I rebased it.

I added documentation that explains how this parameter works. I decided to
rename the parameter from apply_delay to min_apply_delay to use the same
terminology as physical replication. IMO the new name makes it clear that
there is no guarantee that we are always exactly x ms behind the publisher.
Indeed, due to processing and transfer time, the delay might be higher than
the specified interval.
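
To illustrate why the value is a minimum rather than an exact lag, here is a
minimal sketch of the arithmetic. It is not code from the patch: the helper
name remaining_delay_ms(), the ts_ms type and the millisecond units are
assumptions made for the example, and it assumes the publisher and subscriber
clocks are comparable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: milliseconds since an arbitrary epoch. */
typedef int64_t ts_ms;

/*
 * Hypothetical helper (not from the patch): return how many milliseconds
 * the subscriber should still wait before applying a transaction whose
 * reference timestamp on the publisher is ref_ts.
 */
static int64_t
remaining_delay_ms(ts_ms ref_ts, ts_ms now, int64_t min_apply_delay_ms)
{
    assert(min_apply_delay_ms >= 0);

    /* Earliest moment at which the transaction may be applied. */
    ts_ms apply_at = ref_ts + min_apply_delay_ms;

    /*
     * If the change arrived late (transfer or processing time), the
     * deadline has already passed and there is nothing left to wait for.
     * The effective lag is then larger than min_apply_delay_ms.
     */
    return (now >= apply_at) ? 0 : apply_at - now;
}
```

With a 500 ms setting, a transaction stamped at t=1000 that reaches the
subscriber at t=1200 waits another 300 ms; one that arrives at t=1800 is
applied immediately, already 800 ms behind.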

I refactored the way the delay is applied. The previous patch only covered
regular transactions. This new one also covers prepared transactions. The
current design intercepts the transaction at its first change (at the time
it would start the transaction to apply the changes) and applies the delay
before actually starting the transaction. The previous patch used
begin_replication_step() as this point. However, to support prepared
transactions I changed the apply_delay() signature to accept a timestamp
parameter (because we use another variable to calculate the delay for prepared
transactions -- prepare_time). Hence, the apply_delay() call moved to other
places -- apply_handle_begin() and apply_handle_begin_prepare().

The new code does not apply the delay in 2 situations:

* STREAM START: streamed transactions might not have commit_time or
prepare_time set. I'm afraid it is not possible to use those variables
because at STREAM START time we don't have a transaction commit time. If the
protocol provided a timestamp that indicates when it starts streaming
the transaction, we could use it to apply the delay. Unfortunately, we
don't have it. Hence, this new patch does not apply the delay to
streamed transactions.
* non-transactional messages: the delay could be applied to non-transactional
messages too. Such a message is sent independently of the transaction that
contains it. Since logical replication does not send messages to the
subscriber, this is not an issue. However, consumers that use pgoutput and
want to implement a delay will require it.
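
The cases above can be summarized in a small sketch. This is not code from
the patch: the msg_kind enum, the msg struct and delay_reference_time() are
hypothetical names used only to show which timestamp, if any, is available
for each kind of message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message kinds, named after the cases discussed above. */
typedef enum
{
    MSG_BEGIN,              /* regular transaction */
    MSG_BEGIN_PREPARE,      /* prepared transaction */
    MSG_STREAM_START,       /* streamed transaction */
    MSG_NON_TRANSACTIONAL   /* non-transactional message */
} msg_kind;

/* Hypothetical, simplified view of an incoming message. */
typedef struct
{
    msg_kind kind;
    int64_t  commit_time_ms;   /* meaningful for MSG_BEGIN only */
    int64_t  prepare_time_ms;  /* meaningful for MSG_BEGIN_PREPARE only */
} msg;

/*
 * Return true and store the timestamp the delay is measured from in
 * *ref_ts when a delay applies; return false when it does not.
 */
static bool
delay_reference_time(const msg *m, int64_t *ref_ts)
{
    assert(m != NULL && ref_ts != NULL);

    switch (m->kind)
    {
        case MSG_BEGIN:
            *ref_ts = m->commit_time_ms;
            return true;
        case MSG_BEGIN_PREPARE:
            *ref_ts = m->prepare_time_ms;
            return true;
        case MSG_STREAM_START:
            /* No commit or prepare time is known yet: no delay. */
            return false;
        case MSG_NON_TRANSACTIONAL:
            /* Not delayed in this sketch. */
            return false;
    }
    return false;
}
```

A caller would wait only when the function returns true, measuring the delay
from the returned timestamp.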

I'm still looking for a way to support streamed transactions without much
surgery into the logical replication protocol.

--
Euler Taveira
EDB https://www.enterprisedb.com/

Attachment Content-Type Size
v2-0001-Time-delayed-logical-replication-subscriber.patch text/x-patch 47.8 KB
