Re: Optionally automatically disable logical replication subscriptions on error

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, "Smith, Peter" <peters(at)fast(dot)au(dot)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Optionally automatically disable logical replication subscriptions on error
Date: 2021-12-02 04:48:59
Message-ID: CAA4eK1+VS6LiD844jOArYTXTgWfiDVUmUmbP_1gG-PvYXp_JEA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Dec 2, 2021 at 6:35 AM osumi(dot)takamichi(at)fujitsu(dot)com
<osumi(dot)takamichi(at)fujitsu(dot)com> wrote:
>
> On Wednesday, December 1, 2021 10:16 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Updated the patch to include the notification.
>

The patch disables the subscription only for non-transient errors. I am
not sure we can reliably decide whether a particular error is transient.
For example, DISK_FULL or OUT_OF_MEMORY might not rectify itself. Why not
just allow disabling the subscription on any error? The user can then
check the error either in the view or in the logs and decide whether to
re-enable the subscription right away or to fix something first (like
freeing up disk space or fixing the network).

The other problem I see with this transient error stuff is maintaining
the list of error codes that we think are transient. We would need a
discussion for each of the error codes we are listing now and for every
new error code we add in the future, which doesn't seem like a good
idea.
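
To make the maintenance concern concrete, here is a minimal standalone
sketch in C. It is not code from the patch: the enum values, the contents
of the transient list, and both function names are made up for
illustration, and which codes the patch really classifies as transient is
an assumption here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical error codes standing in for real SQLSTATE values. */
typedef enum SketchErrCode
{
    SKETCH_CONNECTION_FAILURE,
    SKETCH_DISK_FULL,
    SKETCH_OUT_OF_MEMORY,
    SKETCH_UNIQUE_VIOLATION
} SketchErrCode;

/*
 * Approach A: a hand-maintained list of codes assumed to be transient.
 * Every entry here, and every error code added later, needs a decision.
 */
static const SketchErrCode sketch_transient_codes[] = {
    SKETCH_CONNECTION_FAILURE,
    SKETCH_DISK_FULL,       /* debatable: may never rectify itself */
    SKETCH_OUT_OF_MEMORY    /* debatable for the same reason */
};

bool
sketch_is_transient(SketchErrCode code)
{
    size_t n = sizeof(sketch_transient_codes) / sizeof(sketch_transient_codes[0]);

    for (size_t i = 0; i < n; i++)
    {
        if (sketch_transient_codes[i] == code)
            return true;
    }
    return false;
}

/* Approach A: disable only when the error is not on the transient list. */
bool
sketch_disable_if_non_transient(bool option_enabled, SketchErrCode code)
{
    return option_enabled && !sketch_is_transient(code);
}

/* Approach B: disable on any error; the error code plays no role. */
bool
sketch_disable_on_any_error(bool option_enabled, SketchErrCode code)
{
    (void) code;                /* deliberately unused */
    return option_enabled;
}
```

With approach A, a full disk that never frees up would keep the worker
retrying; with approach B there is no list to argue about and the user
decides when to re-enable.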

I think the code that handles apply worker errors and then disables the
subscription has some flaws. Say disabling the subscription itself
leads to another error; then I think the original error won't be
reported. Can't we simply emit the error via EmitErrorReport, then do
AbortOutOfAnyTransaction, FlushErrorState, and any other memory context
cleanup if required, and then disable the subscription after coming out
of the catch block?
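
The ordering I have in mind can be sketched as a standalone C program
fragment. This is only a model of the control flow, not backend code:
setjmp/longjmp stands in for PG_TRY/PG_CATCH, and SketchWorker,
sketch_raise, sketch_apply_changes, and sketch_run_apply_worker are
hypothetical names. The comments mark where the real EmitErrorReport,
AbortOutOfAnyTransaction, and FlushErrorState calls would go.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of the apply worker state relevant here. */
typedef struct SketchWorker
{
    bool subscription_enabled;
    bool disable_on_error;      /* the option under discussion */
    bool in_transaction;
    char reported[128];         /* what reached the "log" */
} SketchWorker;

static jmp_buf sketch_catch_point;
static char sketch_pending_error[128];

/* Stand-in for ereport(ERROR, ...): save the message, jump to the catch. */
static void
sketch_raise(const char *msg)
{
    snprintf(sketch_pending_error, sizeof(sketch_pending_error), "%s", msg);
    longjmp(sketch_catch_point, 1);
}

/* Stand-in for applying remote changes; fails on request. */
static void
sketch_apply_changes(bool fail)
{
    if (fail)
        sketch_raise("duplicate key value violates unique constraint");
}

/* Returns true if the changes were applied without error. */
bool
sketch_run_apply_worker(SketchWorker *w, bool fail)
{
    bool caught = false;

    w->in_transaction = true;

    if (setjmp(sketch_catch_point) == 0)
    {
        /* "try" block */
        sketch_apply_changes(fail);
        w->in_transaction = false;      /* commit */
    }
    else
    {
        /* "catch" block: report first, so the original error is not lost */
        snprintf(w->reported, sizeof(w->reported), "%s",
                 sketch_pending_error);          /* EmitErrorReport() */
        w->in_transaction = false;               /* AbortOutOfAnyTransaction() */
        sketch_pending_error[0] = '\0';          /* FlushErrorState() */
        caught = true;
    }

    /*
     * Outside the catch block: the original error is already reported, so
     * a failure while disabling would be a separate, ordinary error.
     */
    if (caught && w->disable_on_error)
        w->subscription_enabled = false;

    return !caught;
}
```

The point of the ordering is that the disable step runs only after the
original error has been emitted and the error state cleaned up.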

--
With Regards,
Amit Kapila.
