Re: Improve the connection failure error messages

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Aleksander Alekseev <aleksander(at)timescale(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>
Subject: Re: Improve the connection failure error messages
Date: 2024-01-16 23:26:01
Message-ID: CAHut+Pvcf7P2dHbCeWPM4jQ=yHqf4WFS_C6eVb8V=bcZPMMp7A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jan 13, 2024 at 12:36 AM Aleksander Alekseev
<aleksander(at)timescale(dot)com> wrote:
>
> Hi,
>
> Thanks for the patch.
>
> > Due to this behavior, it is not possible to add a test to show the
> > error message as it is done for CREATE SUBSCRIPTION.
> > Let me know if you think there is another way to add this test.
>
> I believe it can be done with TAP tests, see for instance:
>
> contrib/auto_explain/t/001_auto_explain.pl
>
> However I wouldn't insist on including the test in scope of this
> particular patch. Such a test doesn't currently exist, it can be added
> as a separate patch, and whether this is actually a useful test (all
> the tests consume time after all...) is somewhat debatable. Personally
> I agree that it would be nice to have though.
>
> This being said...
>
> > The ALTER SUBSCRIPTION command does not error out on the user
> > interface if updated with a bad connection string and the connection
> > failure error can only be seen in the respective log file.
>
> I wonder if we should fix this. Again, not necessarily in scope of
> this work, but if this is not a difficult task, again, it would be
> nice to have.
>
> Other than that v2 looks good.
>

OK. I see now that any ALTER of the subscription's connection, even
to some value that fails, will restart a new worker (like ALTER of any
other subscription parameters). For a bad connection, it will continue
to relaunch-worker/ERROR over and over.

test_sub=# \r2024-01-17 09:34:28.665 AEDT [11274] LOG: logical
replication apply worker for subscription "sub4" has started
2024-01-17 09:34:28.666 AEDT [11274] ERROR: could not connect to the
publisher: invalid port number: "-1"
2024-01-17 09:34:28.667 AEDT [928] LOG: background worker "logical
replication apply worker" (PID 11274) exited with exit code 1
dRs su2024-01-17 09:34:33.669 AEDT [11391] LOG: logical replication
apply worker for subscription "sub4" has started
2024-01-17 09:34:33.669 AEDT [11391] ERROR: could not connect to the
publisher: invalid port number: "-1"
2024-01-17 09:34:33.670 AEDT [928] LOG: background worker "logical
replication apply worker" (PID 11391) exited with exit code 1
b4
...

I don't really have any opinion if that behaviour is good or bad, but
anyway, it is deliberate, and IMO it is outside the scope of your
patch, so v2 patch LGTM.

~~

BTW, while experimenting with the bad connection ALTER I also tried
setting 'disable_on_error' like below:

ALTER SUBSCRIPTION sub4 SET (disable_on_error);
ALTER SUBSCRIPTION sub4 CONNECTION 'port = -1';

...but here the subscription did not become DISABLED as I expected it
would do on the next connection error iteration. It remains enabled
and just continues to loop relaunch/ERROR indefinitely same as before.

That looks like it may be a bug. Thoughts?

======
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2024-01-17 00:20:21 Re: [PoC] Improve dead tuple storage for lazy vacuum
Previous Message Melanie Plageman 2024-01-16 23:07:24 Re: Emit fewer vacuum records by reaping removable tuples during pruning