Re: Libpq support to connect to standby server as priority

From: Greg Nancarrow <gregn4422(at)gmail(dot)com>
To: "Smith, Peter" <peters(at)fast(dot)au(dot)fujitsu(dot)com>
Cc: David Steele <david(at)pgmasters(dot)net>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dave Cramer <pg(at)fastcrypt(dot)com>, Jing Wang <jingwangian(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Elvis Pranskevichus <elprans(at)gmail(dot)com>
Subject: Re: Libpq support to connect to standby server as priority
Date: 2020-08-18 11:32:00
Message-ID: CAJcOf-cW9zR4c25DLESQsOmhQExiNxd+pAixUqmX-KkSANjvjA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Peter,

I have updated the patch (attached) based on your comments, with
adjustments made for additional changes based on feedback (which I
tend to agree with) from Robert Haas and Tsunakawa san, who suggested
read-write/read-only should be functionally different to
primary/standby, and not just have "read-write" a synonym for
"primary".
I also thought it appropriate to remove "read-write", "standby" and
"prefer-standby" from accepted values for "target_server_type"
(instead just support "secondary" and "prefer-secondary") to match the
similar targetServerType PGJDBC option.
So currently have as supported option values:

target_session_attrs:
any/read-write/read-only/primary/standby(/secondary)/prefer-standby(/prefer-secondary)
target_server_type: any/primary/secondary/prefer-secondary

See my responses to your review comments below:

>GENERAL COMMENT 1 ("any")
>
>"any" should be included as valid option for target_server_type.
>
>IIUC target_server_type was added mostly to have better alignment with JDBC options.
>Both Vladimir [1] and Dave [2] already said that JDBC does have an "any" option.
>[1] - https://www.postgresql.org/message-id/CAB%3DJe-FwOVE%3D8gR1UDDZRnWZR65fRG40e8zW_U_6mnUqbce68g%40mail.gmail.com
>[2] - https://www.postgresql.org/message-id/CADK3HHJ9316ji7L-97cJBY%3Dwp4E3ddPMn8XdkNz6j8d9u0OhmQ%40mail.gmail.com
>
>Furthermore, the fe-connect.c function makeEmptyPGConn sets default:
>+ conn->requested_server_type = SERVER_TYPE_ANY;
>This means the default type of target_server_type is "any".
>Since this is default, it should also be possible to assign the same value to explicitly.
>
>(Parts of the v17 patch affected by this are itemised below)

GN RESPONSE: After checking the PGJDBC source and previous comments, I agree.
Have updated the patch to allow "any" for target_server_type.

====

>GENERAL COMMENT 2 (Removal of pg_is_in_recovery)
>
>Around 22/3/2019 Hari added a lot of pg_is_in_recovery code in his patch 0006 [1]
>[1] - https://www.postgresql.org/message-id/CAJrrPGd4YeA%2BN%3DxC%2B1XPVoGzMCATJZY4irVQEJ6i0aPqorUi7g%40mail.gmail.com
>
>Much later IIUC the latest v17 patch has taken onboard the recommendation from Alvaro and removed all that code:
>"I would discard the whole thing about checking "SELECT pg_is_in_recovery()"" [2]
>[2] - https://www.postgresql.org/message-id/20191227130828.GA21647%40alvherre.pgsql
>
>However, it seems that not ALL parts of the original code got cleanly removed in v17.
>There are a number of references to CONNECTION_CHECK_RECOVERY and pg_is_in_recovery still lurking.
>
>(Parts of the v17 patch affected by this are itemised below)
>

GN RESPONSE: Agree. The calling code was removed but somehow the
CONNECTION_CHECK_RECOVERY case block (and enum) was not removed. Also,
part of the documentation was not updated, for the case where the
server version is prior to 14.
I have updated the patch to correct this.

====

>COMMENT libpq.sgml (para blocks)
>
>+ <para>
>
>The v17 patch for target_session_attrs and target_server_type help is currently using <para> blocks for each of the possible >option values.
>This format is inconsistent document style with other variables in this SGML.
>Other places are using sub-lists for option values. e.g. look at "six modes" of sslmode.

GN RESPONSE: True, but this was the case BEFORE the patch, and these
options are more complex than ones where sub-lists for option values
are used - there needs to be common explanation of what the option
synonyms are, and how the behaviour is version dependent, so it
doesn't really lend itself to simple list items, that would need to
cross-reference other list items.

====

>COMMENT libpq.sgml (cut/paste parameter description)
>
>I don't think that target_server_type help should be just a cut/paste duplicate of target_session_attrs. It is confusing >because it leaves the reader doubting the purpose of having such a duplication.
>
>Suggest to simplify the target_server_type help like as follows:
>--
>target_server_type
>The purpose of this parameter is to reflect the similar PGJDBC targetServerType.
>The supported options are same as target_session_attrs.
>This parameter overrides any connection type specified by target_session_attrs.
>--

GN RESPONSE: Agree. Will update documentation, though with some
modifications to the wording because of changes in supported option
values already mentioned, and target_session_attrs could contain
non-server-type options in the future.

====

>COMMENT libpq.sgml (pg_is_in_recovery)
>
>(As noted in GENERAL COMMENT 2 there are still residual references to pg_is_in_recovery)
>
>+ <para>
>+ If this parameter is set to <literal>standby</literal>, only a connection in which
>+ the server is in recovery mode is considered acceptable. If the server is prior to version 14,
>+ the query <literal>SELECT pg_is_in_recovery()</literal> will be sent upon any successful
>+ connection; if it returns <literal>t</literal>, it means the server is in recovery mode.
>+ </para>
>
>Suggest change to:
>--
>If this parameter is set to <literal>standby</literal>, only a connection in which the server is in recovery mode is considered >acceptable. The recovery mode state is determined by the value of the in_recovery configuration parameter that is reported by >the server upon successful connection. Otherwise, if the server is prior to version 14, only a connection in which read-write >transactions are not accepted by default is considered acceptable. To determine whether the server supports read-write >transactions, the query SHOW transaction_read_only will be sent upon any successful connection; if it returns on, it means the >server doesn't support read-write transactions.
>--

GN RESPONSE: I've removed the residual references to
pg_is_in_recovery, and updated the documentation in a similar way.

====

>COMMENT libpq.sgml (Oxford comma)
>
>+ <varname>integer_datetimes</varname>,
>+ <varname>standard_conforming_strings</varname> and
>+ <varname>in_recovery</varname>.
>
>Previously there was an Oxford comma (e.g. before the "and"). Now there isn't.
>The v17 patch should not alter the previous listing style.

GN RESPONSE: I have restored the Oxford comma to its former glory.

====

>COMMENT protocol.sgml (Oxford comma)
>
>+ <varname>integer_datetimes</varname>,
>+ <varname>standard_conforming_strings</varname> and
>+ <varname>in_recovery</varname>.
>
>Previously there was an Oxford comma (e.g. before the "and"). Now there isn't.
>The v17 patch should not alter the previous listing style.

GN RESPONSE: I have restored the Oxford comma to its former glory.

====

>QUESTION standby.c - SendRecoveryExitSignal

>I wonder if this function is really necessary?
>IIUC the SendRecoveryExitSignal is only called from one place (xlog.c).
>Why not just call SendSignalToAllBackends directly from there and remove this extra layer?
>

GN RESPONSE: It's not much of a layer. It could be argued that having
a common function for this makes sense, in case additional code needs
to be added (so it's then not repeated/missed in places).

====

COMMENT postgres.c (signal comment)

>+ /* signal that work needs to be done */
>+ recoveryExitInterruptPending = true;
>
>Suggest change comment to say:
>/* flag that work needs to be done */

GN RESPONSE: Agree, have updated the patch.

====

>COMMENT fe-connect.c (sizeof)
>
>- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
>+ "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */
>
>According to the SGML "prefer-secondary" is also an acceptable value for target_session_attrs, so the display field width >should be 17 /* sizeof("prefer-secondary") */ not 15.

GN RESPONSE: I'm not sure about this, it's debatable. The intention of
these settings is to provide information for a "generic database
connection dialog". For "Target-Session-Attrs" I'd probably expect the
dialog to list the option "prefer-standby" rather than the
(PGJDBC-compatible) synonym "prefer-secondary" (whose length of 17 was
used in the case of "Target-Server-Type").

====

>COMMENT fe-connect.c (CONNECTION_CHECK_RECOVERY)
>
>@@ -2310,6 +2461,7 @@ PQconnectPoll(PGconn *conn)
> case CONNECTION_CHECK_WRITABLE:
> case CONNECTION_CONSUME:
> case CONNECTION_GSS_STARTUP:
>+ case CONNECTION_CHECK_RECOVERY:
> break;
>
>(As noted in GENERAL COMMENT 2 there are still residual references to pg_is_in_recovery)
>
>Probably this CONNECTION_CHECK_RECOVERY case should be removed.

GN RESPONSE: Agree, removed because it is no longer used.

====

>COMMENT fe-connect.c - function validateAndRecordTargetServerType
>
>As noted in GENERAL COMMENT 1, I suggest "any" needs to be included in this function as a valid option.

GN RESPONSE: Agree, updated patch.

====

>COMMENT fe-connect.c (target_session_attrs validation)
>
>@@ -1396,8 +1425,9 @@ connectOptions2(PGconn *conn)
> */
> if (conn->target_session_attrs)
> {
>- if (strcmp(conn->target_session_attrs, "any") != 0
>- && strcmp(conn->target_session_attrs, "read-write") != 0)
>+ if (strcmp(conn->target_session_attrs, "any") == 0)
>+ conn->requested_server_type = SERVER_TYPE_ANY;
>+ else if (!validateAndRecordTargetServerType(conn->target_session_attrs, &conn->requested_server_type))
>
>I suggest introducing a 2nd function for target_session_attrs (e.g. validateAndRecordTargetSessionAttrs).
>Even though these parameters are functionally the same today, in future they may not be [1].
>[1] - https://www.postgresql.org/message-id/20200109152539.GA29017%40alvherre.pgsql
>
>Regardless, the special "any" handling can be removed from here because (from GENERAL COMMENT 1) the >validateAndRecordTargetServerType should now accept "any".

GN RESPONSE: Agree, have added separate validation functions.

====

>COMMENT fe-connect.c (message typo)
>
>Found an existing typo, unrelated to the v17 patch.
>
>"target_settion_attrs", --> "target_session_attrs",

GN RESPONSE: Have updated the patch to correct that.

====

>COMMENT fe-connect.c (libpq_gettext)
>
>+ printfPQExpBuffer(&conn->errorMessage,
>+ libpq_gettext("invalid target_server_type value: \"%s\"\n"),
>+ conn->target_server_type);
>
>The parameter name "target_server_type" should be separated from the format string as "%s", the same as is done by the >libpq_gettext of the preceding code.

GN RESPONSE: Agree, was not correct in the v17 patch, have updated the patch.

====

>COMMENT fe-connect.c (indentation)
>
>+ goto error_return;
>+ }
> }
>+ else
> conn->whichhost++;
>
>Bad indentation of the else's statement.

GN RESPONSE: Updated the patch to fix that.

====

>COMMENT fe-connect.c (if/else complexity)
>
>+ else if ((conn->in_recovery &&
>+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
>+ (!conn->in_recovery &&
>+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
>+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
>+ {
>
>TBH I was unable to read this code without first drawing up a matrix of combinations to deduce what was going on.
>It should not be so inscrutable.
>
>Suggestion1:
>Consider putting a large comment at the top of this CONNECTION_CHECK_TARGET to give the overview what this code is trying to >acheive.
>...
>
>Suggestion2:
>Consider to separate out the requested_server_type cases instead of trying to hand everything in the same else block. The code >may be a bit longer, but by aligning it more closely with the SGML documentation it can be made easier to understand.

GN RESPONSE: Some slight restructuring has been made and comments
updated to express the logic in words, to assist in understanding.
(It's actually not that bad, but maybe I've been looking at this for too long).

====

>COMMENT fe-connect.c (case CONNECTION_CHECK_RECOVERY)
>
>(As noted in GENERAL COMMENT 2 there are still residual references to pg_is_in_recovery)
>
>v17 patch has removed the previous call to pg_is_in_recovery.
>IIUC this means that there is currently no way for the remaining CONNECTION_CHECK_RECOVERY case to even be executed.
>
>If I am correct, then a significant slab of code (~100 lines) can be deleted.
>See case CONNECTION_CHECK_RECOVERY (lines ~ 4007 thru 4110)

GN RESPONSE: Agree, have removed code that is no longer called.

====

>COMMENT fe-connect.c - function freePGConn (missing free?)
>
>There is code to free(conn->target_session_attrs), but there is no code to free target_server_type.
>Appears to be accidental omission.

GN RESPONSE: Have added missing free(), oops.

====

>COMMENT fe-exec.c (altered comment)
>
>- * Special hacks: remember client_encoding and
>+ * Special hacks: remember client_encoding, and
>
>A comma was added.
>Suggest avoid altering comments not directly related to the v17 patch logic.

GN RESPONSE: Have removed the (Oxford!) comma, accidently added.

====

>COMMENT libpq-fe.h (CONNECTION_CHECK_RECOVERY)
>
>+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target connection */
>+ CONNECTION_CHECK_RECOVERY /* Check whether server is in recovery */
>
>(As noted in GENERAL COMMENT 2 there are still residual references to pg_is_in_recovery)
>
>Probably this CONNECTION_CHECK_RECOVERY case should be removed.

GN RESPONSE: Have removed, no longer used.

>But I do have a couple of additional review comments about the test code.

====

>COMMENT - missing "any" tests
>
>In my earlier code review (previous email) I suggested that "any" should be added as valid option to the target_server_type >parameter.
>
>But this now means there are some missing test cases for
>
>target_server_type = "any"

GN RESPONSE: Have added "any" tests for target_server_type.

====

>COMMENT - negative tests?
>
>IIUC when "standby" (aka "secondary") is specified, and there is no in_recovery server available, then the result should be an >error like "could not make a readonly connection to server "
>
>But I did not find any such error combination tests.
>
>e.g. Where are these test cases?
>
>target_session_attrs = "standby", when no standby is available
>target_session_attrs = "secondary", when no standby is available
>target_server_type = "standby", when no standby is available
>target_server_type = "secondary", when no standby is available
>
>--
>
>And, similarly for "could not make a writable connection to server ".
>
>e.g. Where are these test cases?
>
>target_session_attrs = "primary", when no primary is available
>target_session_attrs = "read-write", when no primary is available
>target_server_type = "primary", when no primary is available
>target_server_type = "read-write", when no primary is available

GN RESPONSE: No such negative tests existed for target_session_attrs
prior to this patch.
I have added some negative tests for both target_session_attrs and
target_server_type.
Note that in the v18 patch, "standby" and "read-write" are no longer
allowed for "target_server_type" (since not PGJDBC driver compatible).
Also, "read-write" is no longer considered a synonym for "primary" -
"read-write" means writeable (non read-only) and "primary" means not
in recovery.
Tests were adjusted accordingly.

Regards,
Greg Nancarrow
Fujitsu Australia

Attachment Content-Type Size
v18-0001-Enhance-libpq-target_session_attrs-and-add-target_se.patch application/octet-stream 56.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2020-08-18 12:06:35 Re: jsonb, collection & postgres_fdw
Previous Message Chris Travers 2020-08-18 11:29:05 Re: recovering from "found xmin ... from before relfrozenxid ..."