Re: Minimal logical decoding on standbys

From: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, fabriziomello(at)gmail(dot)com, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Rahila Syed <rahila(dot)syed(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Minimal logical decoding on standbys
Date: 2022-12-02 09:32:23
Message-ID: b0366278-a48a-1af2-c389-e84c1e207d06@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 11/25/22 11:26 AM, Drouvot, Bertrand wrote:
> Hi,
>
> On 9/30/22 2:11 PM, Drouvot, Bertrand wrote:
>> Hi,
>>
>> On 7/6/22 3:30 PM, Drouvot, Bertrand wrote:
>>> Hi,
>>>
>>> On 10/28/21 11:07 PM, Andres Freund wrote:
>>>> Hi,
>>>>
>>>> On 2021-10-28 16:24:22 -0400, Robert Haas wrote:
>>>>> On Wed, Oct 27, 2021 at 2:56 AM Drouvot, Bertrand <bdrouvot(at)amazon(dot)com> wrote:
>>>>>> So you have in mind to check for XLogLogicalInfoActive() first, and if true, then open the relation and call
>>>>>> RelationIsAccessibleInLogicalDecoding()?
>>>>> I think 0001 is utterly unacceptable. We cannot add calls to
>>>>> table_open() in low-level functions like this. Suppose for example
>>>>> that _bt_getbuf() calls _bt_log_reuse_page() which with 0001 applied
>>>>> would call get_rel_logical_catalog(). _bt_getbuf() will have acquired
>>>>> a buffer lock on the page. The idea that it's safe to call
>>>>> table_open() while holding a buffer lock cannot be taken seriously.
>>>> Yes - that's pretty clearly a deadlock hazard. It shouldn't too hard to fix, I
>>>> think. Possibly a bit more verbose than nice, but...
>>>>
>>>> Alternatively we could propagate the information whether a relcache entry is
>>>> for a catalog from the table to the index. Then we'd not need to change the
>>>> btree code to pass the table down.
>>>
>>> Looking closer at RelationIsAccessibleInLogicalDecoding() It seems to me that the missing part to be able to tell whether or not an index is for a catalog is the rd_options->user_catalog_table value of its related heap relation.
>>>
>>> Then, a way to achieve that could be to:
>>>
>>> - Add to Relation a new "heap_rd_options" representing the rd_options of the related heap relation when appropriate
>>>
>>> - Trigger the related indexes relcache invalidations when an ATExecSetRelOptions() is triggered on a heap relation
>>>
>>> - Write an equivalent of RelationIsUsedAsCatalogTable() for indexes that would make use of the heap_rd_options instead
>>>
>>> Does that sound like a valid option to you or do you have another idea in mind to propagate the information whether a relcache entry is for a catalog from the table to the index?
>>>
>>
>> I ended up with the attached proposal to propagate the catalog information to the indexes.
>>
>> The attached adds a new field "isusercatalog" in pg_index to indicate whether or not the index is linked to a table that has the storage parameter user_catalog_table set to true.
>>
>> Then it defines new macros, including "IndexIsAccessibleInLogicalDecoding" making use of this new field.
>>
>> This new macro replaces get_rel_logical_catalog() that was part of the previous patch version.
>>
>> What do you think about this approach and the attached?
>>
>> If that sounds reasonable, then I'll add tap tests for it and try to improve the way isusercatalog is propagated to the index(es) in case a reset is done on user_catalog_table on the table (currently in this POC patch, it's hardcoded to "false" which is the default value for user_catalog_table in boolRelOpts[]) (A better approach would be probably to retrieve the value from the table once the reset is done and then propagate it to the index(es).)
>
> Please find attached a rebase to propagate the catalog information to the indexes.
> It also takes care of the RESET on user_catalog_table (adding a new Macro "HEAP_DEFAULT_USER_CATALOG_TABLE") and adds a few tests in contrib/test_decoding/sql/ddl.sql.

Please find attached a new patch series:

v27-0001-Add-info-in-WAL-records-in-preparation-for-logic.patch
v27-0002-Handle-logical-slot-conflicts-on-standby.patch
v27-0003-Allow-logical-decoding-on-standby.patch
v27-0004-New-TAP-test-for-logical-decoding-on-standby.patch
v27-0005-Doc-changes-describing-details-about-logical-dec.patch
v27-0006-Fixing-Walsender-corner-case-with-logical-decodi.patch

with the previous comments addressed, means mainly:

1/ don't call table_open() in low-level functions in 0001: this is done with a new field "isusercatalog" in pg_index to indicate whether or not the index is linked to a table that has the storage parameter user_catalog_table set to true (we may want to make this field "invisible" though). This new field is then used in the new IndexIsAccessibleInLogicalDecoding Macro (through IndexIsUserCatalog).

2/ Renaming the new field generated in the xlog record (to arrange conflict handling) from "onCatalogTable" to "onCatalogAccessibleInLogicalDecoding" to avoid any confusion (see 0001).

3/ Making sure that "currTLI" is the current one in logical_read_xlog_page() (see 0003).

4/ Fixing Walsender/startup process corner case: It's done in 0006 (I thought it is better to keep the other patches purely "feature" related and to address this corner case separately to ease the review). The fix is making use of a new

condition variable "replayedCV" so that the startup process can broadcast the walsender(s) once a replay is done.

Remarks:

- The new confl_active_logicalslot field added in pg_stat_database_conflicts (see 0002) is incremented only if the slot being invalidated is active (I think it makes more sense in regard to the other fields too). In all the cases (active/not active) the slot invalidation is reported in the logfile. The documentation update mentions this behavior (see 0002).

- LogStandbySnapshot() being moved outside of the loop in ReplicationSlotReserveWal() (see 0003), is a proposal made by Andres in [1] and I think it makes sense.

- Tap tests (see 0004) are covering: tests that the logical decoding on standby behaves correctly, conflicts, slots invalidations, standby promotion.

Looking forward to your feedback,

[1]: https://www.postgresql.org/message-id/20210406180231.qsnkyrgrm7gtxb73%40alap3.anarazel.de

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v27-0006-Fixing-Walsender-corner-case-with-logical-decodi.patch text/plain 7.5 KB
v27-0005-Doc-changes-describing-details-about-logical-dec.patch text/plain 2.1 KB
v27-0004-New-TAP-test-for-logical-decoding-on-standby.patch text/plain 20.7 KB
v27-0003-Allow-logical-decoding-on-standby.patch text/plain 11.5 KB
v27-0002-Handle-logical-slot-conflicts-on-standby.patch text/plain 28.2 KB
v27-0001-Add-info-in-WAL-records-in-preparation-for-logic.patch text/plain 25.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Niyas Sait 2022-12-02 09:37:43 Re: [PATCH] Add native windows on arm64 support
Previous Message Daniel Gustafsson 2022-12-02 08:34:11 Re: pg_dump: Remove "blob" terminology