Re: Minimal logical decoding on standbys

From: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, fabriziomello(at)gmail(dot)com, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Rahila Syed <rahila(dot)syed(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Minimal logical decoding on standbys
Date: 2022-12-13 16:37:18
Message-ID: 0ae83e69-bda0-0eb8-ef4a-d169292e3f66@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 12/13/22 2:50 PM, Robert Haas wrote:
> On Tue, Dec 13, 2022 at 5:49 AM Drouvot, Bertrand
> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>>> I think the real problem here is that
>>> RelationIsAccessibleInLogicalDecoding is returning *the wrong answer*
>>> when the relation is a user-catalog table. It does so because it
>>> relies on RelationIsUsedAsCatalogTable, and that macro relies on
>>> checking whether the reloptions include user_catalog_table.
>>
>> [ confusion ]
>
> Sorry, I meant: RelationIsAccessibleInLogicalDecoding is returning
> *the wrong answer* when the relation is an *INDEX*.
>

Yeah, agree. Will fix it in the next patch proposal (adding the index case in it as you proposed up-thread).

>>> It would be very helpful if there were some place to refer to that
>>> explained the design decisions here, like why the feature we're trying
>>> to get requires this infrastructure around indexes to be added. It
>>> could be in the commit messages, an email message, a README, or
>>> whatever, but right now I don't see it anywhere in here.
>>
>> Like adding something around those lines in the commit message?
>>
>> "
>> On a primary database, any catalog rows that may be needed by a logical decoding replication slot are not removed.
>> This is done thanks to the catalog_xmin associated with the logical replication slot.
>>
>> With logical decoding on standby, in the following cases:
>>
>> - hot_standby_feedback is off
>> - hot_standby_feedback is on but there is no a physical slot between the primary and the standby. Then, hot_standby_feedback will work, but only while the connection is alive (for example a node restart would break it)
>>
>> Then the primary may delete system catalog rows that could be needed by the logical decoding on the standby. Then, it’s mandatory to identify those rows and invalidate the slots that may need them if any.
>> "
>
> This is very helpful, yes. I think perhaps we need to work some of
> this into the code comments someplace, but getting it into the commit
> message would be a good first step.

Thanks, will do.

>
> What I infer from the above is that the overall design looks like this:
>
> - We want to enable logical decoding on standbys, but replay of WAL
> from the primary might remove data that is needed by logical decoding,
> causing replication conflicts much as hot standby does.
> - Our chosen strategy for dealing with this type of replication slot
> is to invalidate logical slots for which needed data has been removed.
> - To do this we need the latestRemovedXid for each change, just as we
> do for physical replication conflicts, but we also need to know
> whether any particular change was to data that logical replication
> might access.
> - We can't rely on the standby's relcache entries for this purpose in
> any way, because the WAL record that causes the problem might be
> replayed before the standby even reaches consistency. (Is this true? I
> think so.)
> - Therefore every WAL record that potentially removes data from the
> index or heap must carry a flag indicating whether or not it is one
> that might be accessed during logical decoding.
>
> Does that sound right?
>

Yeah, that sounds all right to me.
One option could be to add my proposed wording in the commit message and put your wording above in a README.

> It seems kind of unfortunate to have to add payload to a whole bevy of
> record types for this feature. I think it's worth it, both because the
> feature is extremely important,

Agree and I don't think that there is other option than adding some payload in some WAL records (at the very beginning the proposal was to periodically log a new record
that announces the current catalog xmin horizon).

> and also because there aren't any
> record types that fall into this category that are going to be emitted
> so frequently as to make it a performance problem.

+1

If no objections from your side, I'll submit a patch proposal by tomorrow, which:

- get rid of IndexIsAccessibleInLogicalDecoding
- let RelationIsAccessibleInLogicalDecoding deals with the index case
- takes care of the padding where the new bool is added
- convert this new bool to a flag for the xl_heap_visible case (adding a new bit to the already existing flag)
- Add my proposed wording above to the commit message
- Add your proposed wording above in a README

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2022-12-13 16:42:58 Re: Minimal logical decoding on standbys
Previous Message Bharath Rupireddy 2022-12-13 16:02:19 Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures