Re: [Proposal] Add foreign-server health checks infrastructure

From: Shinya Kato <Shinya11(dot)Kato(at)oss(dot)nttdata(dot)com>
To: kuroda(dot)hayato(at)fujitsu(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [Proposal] Add foreign-server health checks infrastructure
Date: 2021-11-19 08:30:47
Message-ID: 4e64bb8a0a88f5b6ac5a8901377c81af@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2021-11-18 21:43, kuroda(dot)hayato(at)fujitsu(dot)com wrote:
> Dear Kato-san,
>
> Thank you for your interest!
>
>> > I also want you to review the postgres_fdw part,
>> > but I think it should not be attached because cfbot cannot understand
>> > such a dependency
>> > and will throw build error. Do you know how to deal with them in this
>> > case?
>>
>> I don't know how to deal with them, but I hope you will attach the
>> PoC,
>> as it may be easier to review.
>
> OK, I attached the PoC along with the dependent patches. Please see
> the zip file.
> add_helth_check_... patch is written by me, and other two patches are
> just copied from [1].
> In the new callback function ConnectionHash is searched sequentially
> and
> WaitEventSetWait() is performed for WL_SOCKET_CLOSED socket event.
> This event is added by the dependent ones.

Thank you for sending the patches!
I confirmed that they can be compiled and tested successfully on CentOS
8.

I haven't looked at the core of the code yet, but I took a quick look at
it.

+ {
+ {"remote_servers_connection_check_interval", PGC_USERSET,
CONN_AUTH_SETTINGS,
+ gettext_noop("Sets the time interval between checks for
disconnection of remote servers."),
+ NULL,
+ GUC_UNIT_MS
+ },
+ &remote_servers_connection_check_interval,
+ 0, 0, INT_MAX,
+ },

If you don't use check_hook, assign_hook and show_hook, you should
explicitly write "NULL, NULL, NULL", as show below.
+ {
+ {"remote_servers_connection_check_interval", PGC_USERSET,
CONN_AUTH_SETTINGS,
+ gettext_noop("Sets the time interval between checks for
disconnection of remote servers."),
+ NULL,
+ GUC_UNIT_MS
+ },
+ &remote_servers_connection_check_interval,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },

+ ereport(ERROR,
+ errcode(ERRCODE_CONNECTION_FAILURE),
+ errmsg("Postgres foreign server %s might be down.",
+ server->servername));

According to [1], error messages should start with a lowercase letter
and not use a period.
Also, along with the rest of the code, it is a good idea to enclose the
server name in double quotes.

I'll get back to you once I've read all the code.

[1] https://www.postgresql.org/docs/devel/error-style-guide.html

--
Regards,

--
Shinya Kato
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Soumyadeep Chakraborty 2021-11-19 08:35:04 Re: Unnecessary delay in streaming replication due to replay lag
Previous Message Fujii Masao 2021-11-19 08:18:22 issue in pgfdw_report_error()?