Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away
Date: 2020-09-29 15:50:48
Message-ID: CALj2ACXb85AHaUg=qZ0sBNR2nye2Q2NuYV14kBL1oXZud=tOJg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for the comments.

On Tue, Sep 29, 2020 at 7:30 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>
> +1 to add debug3 message there. But this message doesn't seem to
> match with what the error actually happened. What about something like
> "could not start remote transaction on connection %p", instead?
>

Looks better. Changed.

>
> Also maybe it's better to append PQerrorMessage(entry->conn)?
>

Added. Now the log looks like [1].

>
> +-- Generate a connection to remote. Local backend will cache it.
> +SELECT * FROM ft1 LIMIT 1;
>
> The result of this query would not be stable. Probably you need to,
> for example, add ORDER BY or replace * with 1, etc.
>

Changed to SELECT 1 FROM ft1 LIMIT 1;

>
> +-- Retrieve pid of remote backend with application name fdw_retry_check
> +-- and kill it intentionally here. Note that, local backend still has
> +-- the remote connection/backend info in it's cache.
> +SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE
> +backend_type = 'client backend' AND application_name = 'fdw_retry_check';
>
> Isn't this test fragile because there is no gurantee that the target backend
> has exited just after calling pg_terminate_backend()?
>

I think this is okay, because pg_terminate_backend() sends SIGTERM to
the backend, and upon receiving SIGTERM the signal handler die() will
be called and since there is no query being executed on the backend by
the time SIGTERM is received, it will exit immediately. Thoughts?

pqsignal(SIGTERM, die); /* cancel current query and exit */

And also, pg_terminate_backend() returns true in case the backend is
killed successfully, otherwise it returns false. PG_RETURN_BOOL(r
== SIGNAL_BACKEND_SUCCESS);

Attaching v6 patch, please review it further.

[1]
DEBUG: starting remote transaction on connection 0x55cd393a66e0
DEBUG: could not start remote transaction on connection 0x55cd393a66e0
DETAIL: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
DEBUG: closing connection 0x55cd393a66e0 to reestablish a new one
DEBUG: new postgres_fdw connection 0x55cd393a66e0 for server
"foreign_server" (user mapping oid 16398, userid 10)
DEBUG: starting remote transaction on connection 0x55cd393a66e0
DEBUG: closing remote transaction on connection 0x55cd393a66e0

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
v6-Retry-Cached-Remote-Connections-For-postgres_fdw.patch application/octet-stream 7.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2020-09-29 15:53:06 Re: New statistics for tuning WAL buffer size
Previous Message Tom Lane 2020-09-29 15:42:24 Re: Planner making bad choice in alternative subplan decision