Re: Logical decoding without slots: decoding in lockstep with recovery

From: Andres Freund <andres(at)anarazel(dot)de>
To: Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Logical decoding without slots: decoding in lockstep with recovery
Date: 2020-12-25 22:51:55
Message-ID: 20201225225155.zfcukwvyzjvz2fgw@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2020-12-23 14:56:07 +0800, Craig Ringer wrote:
> I want to share an idea I've looked at a few times where I've run into
> situations where logical slots were inadvertently dropped, or where it
> became necessary to decode changes in the past on a slot.
>
> As most of you will know you can't just create a logical slot in the past.
> Even if it was permitted, it'd be unsafe due to catalog_xmin retention
> requirements and missing WAL.
>
> But if we can arrange a physical replica to replay the WAL of interest and
> decode each commit as soon as it's replayed by the startup process, we know
> the needed catalog rows must all exist, so it's safe to decode the change.
>
> So it should be feasible to run logical decoding in standby, even without a
> replication slot, so long as we:
>
> * pause startup process after each xl_xact_commit
> * wake the walsender running logical decoding
> * decode and process until ReorderBufferCommit for the just-committed xact
> returns
> * wake the startup process to decode up to the next commit

I don't think it's safe to just do this for each xl_xact_commit - we can
remove needed rows at quite a few places, not just around transaction
commit. Rows needed to correctly decode rows earlier in the transaction
might not be available by the time the commit record was logged.

I think you'd basically have to run logical decoding in lockstep with
WAL replay, i.e. replay one record, then call logical decoding for that
record, replay the next record, ...
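
To make the difference concrete, here is a toy sketch in Python. It is not
PostgreSQL code: the record tuples, the dict standing in for the catalog, and
every function name are assumptions made up for illustration. It only shows
that decoding a change right after its record is replayed still sees the
catalog row, while deferring decoding to the commit record can find the row
already gone.

```python
# Toy model only: none of these names are PostgreSQL APIs.

def replay(catalog, rec):
    """Apply one WAL record to the toy catalog (stands in for the startup process)."""
    kind = rec[0]
    if kind == "catalog_add":
        catalog[rec[1]] = rec[2]
    elif kind == "catalog_remove":
        catalog.pop(rec[1], None)


def decode_change(catalog, rec):
    """Decode one row change; this needs the catalog row to still exist."""
    _, xid, rel, payload = rec
    if rel not in catalog:
        raise LookupError(f"catalog row for {rel!r} already removed")
    return (xid, catalog[rel], payload)


def decode_lockstep(wal):
    """Replay one record, decode that same record, then move on to the next."""
    catalog, pending, out = {}, {}, []
    for rec in wal:
        replay(catalog, rec)
        if rec[0] == "change":
            pending.setdefault(rec[1], []).append(decode_change(catalog, rec))
        elif rec[0] == "commit":
            out.extend(pending.pop(rec[1], []))
    return out


def decode_at_commit(wal):
    """Replay up to each commit and only then decode the whole transaction."""
    catalog, pending, out = {}, {}, []
    for rec in wal:
        replay(catalog, rec)
        if rec[0] == "change":
            pending.setdefault(rec[1], []).append(rec)
        elif rec[0] == "commit":
            out.extend(decode_change(catalog, r) for r in pending.pop(rec[1], []))
    return out


# Hypothetical WAL stream: the catalog row is removed before the commit record.
WAL = [
    ("catalog_add", "t1", "t1(id int)"),
    ("change", 100, "t1", "INSERT 1"),
    ("catalog_remove", "t1"),
    ("commit", 100),
]
```

With this stream, decode_lockstep(WAL) decodes the insert against the row it
saw at replay time, whereas decode_at_commit(WAL) raises LookupError because
the row was removed between the change and the commit.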

> Can anyone see any obvious problem with this?

The patch for logical decoding on the standby
https://postgr.es/m/20181212204154.nsxf3gzqv3gesl32%40alap3.anarazel.de
should provide some of the infrastructure to do this properly. Should
really commit it. /me adds an entry to the top of the todo list.

Greetings,

Andres Freund
