Re: subscription disable_on_error not working after ALTER SUBSCRIPTION set bad conninfo

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>
Subject: Re: subscription disable_on_error not working after ALTER SUBSCRIPTION set bad conninfo
Date: 2024-01-18 05:44:51
Message-ID: CAHut+PtoTp6GT7xcYN_BDDQnvr-0CkcuxUCB_YyPVaGL7oMQgg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 18, 2024 at 12:55 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
...
>
> Although we can improve it to handle this case too, I'm not sure it's
> a bug. The doc says[1]:
>
> Specifies whether the subscription should be automatically disabled if
> any errors are detected by subscription workers during data
> replication from the publisher.
>
> When an apply worker is trying to establish a connection, it's not
> replicating data from the publisher.
>
> Regards,
>
> [1] https://www.postgresql.org/docs/devel/sql-createsubscription.html#SQL-CREATESUBSCRIPTION-PARAMS-WITH-DISABLE-ON-ERROR
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com

Yeah, I had also seen that wording of those docs. And I agree it
leaves open some room for doubts because strictly from that wording it
can be interpreted that establishing the connection is not actually
"data replication from the publisher" in which case maybe there is no
bug.

OTOH, it was not clear to me if that precise wording was the intention
or not. It could have been written as "Specifies whether the
subscription should be automatically disabled if any errors are
detected by subscription workers." which would mean the same thing 99%
of the time except that would mean that the current behaviour is a
bug.

I tried looking at the original thread where this feature was born [1]
but it is still unclear to me if 'disable_on_error' was meant for
every kind of error or only data replication errors.

Indeed. even the commit message [2] seems to have an each-way bet:
* It talks about errors applying changes --- "Logical replication
apply workers for a subscription can easily get stuck in an infinite
loop of attempting to apply a change..."
* But, it also says it covers any errors --- "When true, both the
tablesync worker and apply worker catch any errors thrown..."

~

Maybe we should be asking ourselves how a user intuitively expects
this option to behave. IMO the answer is right there in the option
name - the subscription says 'disable_on_error' and I got an error, so
I expected the subscription to be disabled. To wriggle out of it by
saying "Ah, but we did not mean _those_ kinds of errors" doesn't quite
seem genuine to me.

======
[1] https://www.postgresql.org/message-id/flat/CAA4eK1KsaVgkO%3DRbjj0bcXZTpeV1QVm0TGkdxZiH73MHfxf6oQ%40mail.gmail.com#d4a0db154fbeca356a494c50ac877ff1
[2] https://github.com/postgres/postgres/commit/705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33

Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-01-18 06:17:26 Re: heavily contended lwlocks with long wait queues scale badly
Previous Message Ashutosh Bapat 2024-01-18 05:26:09 Re: Adding facility for injection points (or probe points?) for more advanced tests