RE: [Proposal] Add foreign-server health checks infrastructure

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Katsuragi Yuta' <katsuragiy(at)oss(dot)nttdata(dot)com>
Cc: 'Ted Yu' <yuzhihong(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, vignesh C <vignesh21(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, "Shinya11(dot)Kato(at)oss(dot)nttdata(dot)com" <Shinya11(dot)Kato(at)oss(dot)nttdata(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Subject: RE: [Proposal] Add foreign-server health checks infrastructure
Date: 2023-03-06 11:56:42
Message-ID: TYCPR01MB58700498838127C54BC9BD48F5B69@TYCPR01MB5870.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Katsuragi-san,

Thank you for reviewing! PSA new version.

> >> 4. the code of pqSocketPoll
> >> +#if defined(POLLRDHUP)
> >> + if (forConnCheck)
> >> + input_fd.events |= POLLRDHUP;
> >> +#endif
> >>
> >> I think it is better to use PQconnCheckable() to remove the macro.
> >
> > IIUC the macro is needed. In FreeBSD, macOS and other platforms do not
> > have the
> > macro POLLRDHUP so they cannot compile. I checked by my CI and got
> > following error.
> >
> > ```
> > ...
> > FAILED: src/interfaces/libpq/libpq.a.p/fe-misc.c.o
> > ...
> > ../src/interfaces/libpq/fe-misc.c:1149:22: error: use of undeclared
> > identifier 'POLLRDHUP'
> > input_fd.events |= POLLRDHUP;
> > ````
> >
> > It must be invisible from them.
>
> Sorry, my mistake...

No issues :-).

> >> 9. the document of postgres_fdw
> >> The document of postgres_fdw_verify_connection_states_* is a little
> >> bit old. Could you update it?
> >
> > Updated. IIUC postgres_fdw_verify_connection_states returns
> >
> > * true, if the connection is verified.
> > * false, if the connection seems to be disconnected.
> > * NULL, if this is not the supported platform or connection has not
> > been established.
> >
> > And postgres_fdw_verify_connection_states_all returns
> >
> > * true if all the connections are verified.
> > * false, if one of connections seems to be disconnected.
> > * NULL, if this is not the supported platform or this backend has
> > never established connections
>
> I think 'backend has never established connections' is a little bit
> strong.
> I think the following cases also return NULL. The case where a
> connection was established, however the connection is now closed
> by the postgres_fdw_disconnect() or something. NULL is also returned if
> the connection is invalidated. So, I think 'NULL, if no valid
> connection is established or the function is not supported on
> the platform.' is better. What do you think?

Disconnect functions have never been in my mind. Descriptions must be updated.

> Followings are my comments for v34. Please check.
>
> 0001:
> 1. the document of pqConnCheck
> + <literal>0</literal> if the socket is valid, and returns
> <literal>-1</literal>
> + if the connection has already been closed or an error has
> occurred.
>
> 1.1 if the socket is valid -> returns 0 if the 'connection is not
> closed'?

Fixed.

> 1.2 returns -1 if the connection has already been closed <- Let me ask
> a question.
> Isn't this a situation where we would like to check using this
> function? Is 'error has occurred' insufficient?

Seems right, fixed.

> 2. the comment of pqSocketCheck
> + * means that the socket has not matched POLLRDHUP event and the socket
> has
> + * still survived.
>
> socket has still survived -> connection is not closed by the socket
> peer?

Fixed.

> 3. the comment of pqSocketPoll
> + * returned and forConnCheck is requested, it means that the socket has
> not
> + * matched POLLRDHUP event and the socket has still survived.
>
> socket has still survived -> connection is not closed by the socket
> peer?

Fixed.

> 0002:
> 4. the comment of verify_cached_connections
> + * This function emits warnings if a disconnection is found. This
> return true
> + * if disconnections cannot be found, otherwise return false.
>
> return ture -> return's' true
> return false -> return's' false

Fixed.

> 5. the comment of postgres_fdw_verify_connection_states
> + * if the local session seems to be disconnected from other servers.
> NULL is
> + * returned if a connection to the specified foreign server has not
> been
> + * established yet, or this function is not available on this platform.
>
> Considering the above discussion, 'NULL is returned if a valid
> connection to the specified foreign server is not established or
> this function...' seems better. What do you think?

Right, fixed.

> 6. the document of postgres_fdw_verify_connection_states
> <literal>NULL</literal>
> + is returned if a connection to the specified foreign server has
> not been
> + established yet, or this function is not available on this
> platform
>
> The same as comment no.5.

Right, fixed.

> 7. the document of postgres_fdw_verify_connection_states_all
> <literal>NULL</literal>
> + is returned if the local session does not have connection caches,
> or this
> + function is not available on this platform.
>
> I think there is a case where a connection cache exists but valid
> connections do not exist and NULL is returned (disconnection case).
> Almost the same document as the postgres_fdw_verify_connection_states
> case (comment no.5) seems better.

Yes, but completely same statement cannot be used because these is not
specified foreign server. How about:
NULL is returned if there are no established connections or this function ...

> 8. the document of postgres_fdw_can_verify_connection_states
> + This function checks whether
> <function>postgres_fdw_verify_connection_states</function>
> + and <function>postgres_fdw_verify_connection_states</function>
> work well
>
> Should the latter (or former) postgres_fdw_verify_connection_states be
> postgres_fdw_verify_connection_states_all?

That was copy-and-paste error, fixed.

> regards,
>
> --
> Katsuragi Yuta
> Advanced Computing Technology Center
> Research and Development Headquarters
> NTT DATA CORPORATION

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment Content-Type Size
v35-0001-Add-PQconnCheck-and-PQconnCheckable-to-libpq.patch application/octet-stream 8.7 KB
v35-0002-postgres_fdw-add-postgres_fdw_verify_connection_.patch application/octet-stream 12.3 KB
v35-0003-add-test.patch application/octet-stream 5.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2023-03-06 12:24:56 Re: GUC for temporarily disabling event triggers
Previous Message Amit Kapila 2023-03-06 11:32:38 Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher