Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away
Date: 2020-07-11 13:58:08
Message-ID: CALj2ACXL5oKfCgogsd-XzriHC4Zkxzsj8j0F60f7-9GDiJQcZg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for the comments. Attaching the v2 patch.

>
> > One way, we could solve the above problem is that, upon firing the new
> > foreign query from local backend using the cached connection,
> > (assuming the remote backend that was cached in the local backed got
> > killed for some reasons), instead of failing the query in the local
> > backend, upon detecting error from the remote backend, we could just
> > delete the cached old entry and try getting another connection to
> > remote backend, cache it and proceed to submit the query. This has to
> > happen only at the beginning of remote xact.
> +1.
>
> I think this is a very useful feature.
> In an environment with connection pooling for local, if a remote
> server has a failover or switchover,
> this feature would prevent unexpected errors of local queries after
> recovery of the remote server.

Thanks for backing this feature.

>
> I haven't looked at the code in detail yet, some comments here.
>

Thanks for the comments. Please feel free to review more of the
attached v2 patch.

>
> 1. To keep the previous behavior (and performance), how about allowing
> the user to specify
> whether or not to retry as a GUC parameter or in the FOREIGN SERVER
OPTION?
>

Do we actually need this? We don't encounter much performance with this
connection retry, as
we just do it at the beginning of the remote xact i.e. only once per a
remote session, if we are
able to establish it's well and good otherwise, the query is bound to fail.

If at all, we need one (if there exists a strong reason to have the
option), then the question is
GUC or the SERVER OPTION?

There's a similar discussion going on having GUC at the core level vs
SERVER OPTION for
postgres_fdw in [1].

>
> 2. How about logging a LOG message when retry was success to let us know
> the retry feature worked or how often the retries worked ?
>

In the v1 patch I added the logging messages, but in v2 patch
"postgres_fdw connection retry is successful" is added. Please note that
all the
new logs are added at level "DEBUG3" as all the existing logs are also at
the same
level.

>
> > I couldn't think of adding a test case to the existing postgres_fdw
> > regression test suite with an automated scenario of the remote backend
> > getting killed.
>
> Couldn't you confirm this by adding a test case like the following?
> ===================================================
> BEGIN;
> -- Generate a connection to remote
> SELECT * FROM ft1 LIMIT 1;
>
> -- retrieve pid of postgres_fdw and kill it
> -- could use the other unique identifier (not postgres_fdw but
> fdw_retry_check, etc ) for application name
> SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE
> backend_type = 'client backend' AND application_name = 'postgres_fdw'
>
> -- COMMIT, so next query will should success if connection-retry works
> COMMIT;
> SELECT * FROM ft1 LIMIT 1;
> ===================================================
>

Yes, this way it works. Thanks for the suggestion. I added the test
case to the postgres_fdw regression test suite. v2 patch has these
changes also.

[1] -
https://www.postgresql.org/message-id/CALj2ACVvrp5%3DAVp2PupEm%2BnAC8S4buqR3fJMmaCoc7ftT0aD2A%40mail.gmail.com
With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
v2-Retry-Cached-Remote-Connections-For-postgres_fdw.patch application/x-patch 10.3 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Petr Jelinek 2020-07-11 14:08:43 Re: Binary support for pgoutput plugin
Previous Message Sascha Kuhl 2020-07-11 13:37:43 Re: WIP: BRIN multi-range indexes