RE: [Proposal] Add foreign-server health checks infrastructure

From: "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Kyotaro Horiguchi' <horikyota(dot)ntt(at)gmail(dot)com>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Shinya11(dot)Kato(at)oss(dot)nttdata(dot)com" <Shinya11(dot)Kato(at)oss(dot)nttdata(dot)com>, "zyu(at)yugabyte(dot)com" <zyu(at)yugabyte(dot)com>, "masao(dot)fujii(at)oss(dot)nttdata(dot)com" <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Subject: RE: [Proposal] Add foreign-server health checks infrastructure
Date: 2022-02-22 02:53:26
Message-ID: TYAPR01MB58664E79C4728A6FFEA46F94F53B9@TYAPR01MB5866.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Horiguchi-san, Fujii-san,

> > I understood here as removing following mechanism from core:
> >
> > * disable timeout at end of tx.
> > * skip if held off or read commands
>
> I think we're on the same page. Anyway query cancel interrupt is
> ignored while rading input.
>
> > > - If an existing connection is found to be dead, just try canceling
> > > the query (or sending query cancel).
> > > One issue with it is how to show the decent message for the query
> > > cancel, but maybe we can have a global variable that suggests the
> > > reason for the cancel.
> >
> > Currently I have no good idea for that but I'll try.
>
> However, I would like to hear others' opnions about the direction, of
> course.
>

Based on the idea, I re-implemented the feature. Almost all feature is
moved to postgres_fdw. The abstract of my patch is as follows:

# core

* Exposes QueryCancelMessage for error reporting
* Uses above if it was not NULL

# postgres_fdw

* Defines new GUC postgres_fdw.health_check_interval.
It is renamed simpler.
* Registers a timeout when initializing connection hash.
* Raises SIGINT and sets QueryCancelMessage to message.
if detects a connection lost.

I also attached a test as zipped file for keep cfbot quiet.
When connection lost is detected, the following message will show:

```
ERROR: Foreign Server (servername) might be down.
```

How do you think?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment Content-Type Size
v10_0001_expose_cancel_message.patch application/octet-stream 1.4 KB
v10_0002_add_health_check.patch application/octet-stream 6.7 KB
v10_0003_add_doc.patch application/octet-stream 1.4 KB
v10_0004_add_test.zip application/x-zip-compressed 854 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2022-02-22 03:37:30 Re: Design of pg_stat_subscription_workers vs pgstats
Previous Message kuroda.hayato@fujitsu.com 2022-02-22 02:51:25 RE: [Proposal] Add foreign-server health checks infrastructure