From: | "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com> |
---|---|
To: | "iwata(dot)aya(at)fujitsu(dot)com" <iwata(dot)aya(at)fujitsu(dot)com> |
Cc: | "'pgsql-hackers(at)lists(dot)postgresql(dot)org'" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | RE: libpq debug log |
Date: | 2021-01-17 05:23:36 |
Message-ID: | TYAPR01MB2990379993AFD6DE211E63B7FEA50@TYAPR01MB2990.jpnprd01.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
From: iwata(dot)aya(at)fujitsu(dot)com <iwata(dot)aya(at)fujitsu(dot)com>
> This patch includes these items:
Is there anything else in this revision that is not handled?
Below are my comments.
Also, why don't you try running the regression tests with a temporary modification to PQtrace() to output the trace to a file? The sole purpose is to confirm that this patch makes the test crash (core dump).
(1)
- conn->Pfdebug = debug_port;
+ if (pqTraceInit(conn))
+ {
+ conn->Pfdebug = debug_port;
+ if (conn->Pfdebug != NULL)
+ setlinebuf(conn->Pfdebug);
+ }
+ else
+ {
+ /* XXX report ENOMEM? */
+ fprintf(conn->Pfdebug, "Failed to initialize trace support\n");
+ fflush(conn->Pfdebug);
+ }
}
* When the passed debug_port is NULL, the function should return after calling PQuntrace().
* Is setlinebuf() available on Windows? Maybe you should use setvbuf() instead.
* I don't see the need for separate pqTraceInit() function, because it is only called here.
(2)
+bool
+pqTraceInit(PGconn *conn)
+{
+ /* already done? */
+ if (conn->be_msg != NULL)
+ {
What's this code for? I think it's sufficient that PQuntrace() frees b_msg and PQtrace() always allocates b_msg.
(3)
+ conn->fe_msg = malloc(sizeof(pqFrontendMessage) +
+ MAX_FRONTEND_MSGS * sizeof(pqFrontendMessageField));
+ conn->fe_msg->field_num = 0;
+ if (conn->fe_msg == NULL)
+ {
+ free(conn->be_msg);
+ /* NULL out for the case that fe_msg malloc fails */
+ conn->be_msg = NULL;
+ return false;
+ }
+ conn->fe_msg->field_num = 0;
The memory for the fields array is one entry larger than necessary. The allocation code would be:
+ conn->fe_msg = malloc(offsetof(pqFrontendMessage, fields) +
+ MAX_FRONTEND_MSGS * sizeof(pqFrontendMessageField));
Also, the line with "field_num = 0" appears twice. The first one should be removed.
(4)
+static void
+pqLogMessageByte1(PGconn *conn, char v, PGCommSource commsource)
+{
...
+ if (PG_PROTOCOL_MAJOR(conn->pversion) >= 3)
+ {
I think you can remove the protocol version check in various logging functions, because PQtrace() disables logging when the protocol version is 2.
(5)
@@ -966,10 +966,6 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
pgParameterStatus *pstatus;
pgParameterStatus *prev;
- if (conn->Pfdebug)
- fprintf(conn->Pfdebug, "pqSaveParameterStatus: '%s' = '%s'\n",
- name, value);
-
Where is this information output instead?
(6)
+ * Set up state so that we can trace. NB -- this might be called mutiple
mutiple -> multiple
(7)
+ LOG_FIRST_BYTE, /* logging the first byte identifing the
+ protocol message type */
identifing -> identifying
(8)
+#define pqLogMsgByte1(CONN, CH, SOURCE) \
+((CONN)->Pfdebug ? pqLogMessageByte1(CONN, CH, SOURCE) : 0)
For functions that return void, follow the style of CHECK_FOR_INTERRUPTS() defined in miscadmin.h.
(9)
+ currtime = time(NULL);
+ tmp = localtime(&currtime);
+ strftime(timestr, sizeof(timestr), "%Y-%m-%d %H:%M:%S %Z", tmp);
+
+ fprintf(conn->Pfdebug, "%s %s ", timestr, message_direction);
It's better to have microsecond accuracy, because knowing the accumulation of such fine-grained timings may help to troubleshoot heavily-loaded client cases. You can mimic setup_formatted_log_time() in src/backend/utils/error/elog.c. This is used for the %m marker in log_line_prefix.
(10)
I think it's better to use tabs (or any other character that is less likely to appear in the log field) as a field separator, because it makes it easier to process the log with a spreadsheet or some other tools.
(11)
+ /*
+ * True type of message tagged '\0' is known when next 4 bytes is
+ * checked. So we delay printing message type to pqLogMsgInt()
+ */
+ if (v != '\0')
+ fprintf(conn->Pfdebug, "%s ",
+ pqGetProtocolMsgType((unsigned char) v, commsource));
In what cases does '\0' appear as a message type?
(12)
+static const char *
+pqGetProtocolMsgType(unsigned char c, PGCommSource commsource)
+{
+ if (commsource == FROM_BACKEND && c < lengthof(protocol_message_type_b))
+ return protocol_message_type_b[c];
+ else if (commsource == FROM_FRONTEND && c < lengthof(protocol_message_type_f))
+ return protocol_message_type_f[c];
+ else
+ return "UnknownMessage";
+}
This function returns NULL (=0) when protocol_message_type_b/f[c] is 0. That crashes the caller:
+ fprintf(conn->Pfdebug, "%s ",
+ pqGetProtocolMsgType((unsigned char) v, commsource));
Plus, you may as well print the invalid message type. Why don't you do something like this:
+static const char *
+pqGetProtocolMsgType(unsigned char c, PGCommSource commsource)
+{
+ static char unknown_message[64];
+ char *msg = NULL;
+ if (commsource == FROM_BACKEND && c < lengthof(protocol_message_type_b))
+ msg = protocol_message_type_b[c];
+ else if (commsource == FROM_FRONTEND && c < lengthof(protocol_message_type_f))
+ msg = protocol_message_type_f[c];
+
+ if (msg == NULL)
+ {
+ sprintf(unknown_message, "Unknown message %02x", c);
+ msg = unknown_message;
+ }
+
+ return msg;
+}
(13)
@@ -156,7 +156,12 @@ pqParseInput3(PGconn *conn)
{
/* If not IDLE state, just wait ... */
if (conn->asyncStatus != PGASYNC_IDLE)
+ {
+ /* Terminate a half-finished logging message */
+ if (conn->Pfdebug)
+ pqTraceForcelyBreakLine(msgLength, conn);
return;
+ }
/*
* Unexpected message in IDLE state; need to recover somehow.
What's this situation like? Why output a new line and reset the trace status?
(14)
+/* pqLogInvalidProtocol: Print that the protocol message is invalid */
+static void
+pqLogInvalidProtocol(PGconn *conn)
+{
+ fprintf(conn->Pfdebug, ":::Invalid Protocol\n");
+ conn->be_msg->state = LOG_FIRST_BYTE;
+}
Is it sufficient to just reset the state field? Isn't it necessary to call pqTraceResetBeMsg() instead?
(15)
@@ -212,15 +368,8 @@ pqSkipnchar(size_t len, PGconn *conn)
...
conn->inCursor += len;
-
return 0;
}
This is an unnecessary removal of an empty line.
(16)
+static void
+pqLogMessageByte1(PGconn *conn, char v, PGCommSource commsource)
+{
+ char *message_direction = commsource == FROM_BACKEND ? "<" : ">";
...
+ switch (conn->be_msg->state)
This function handles messages in both directions. But the switch condition is based on the backend message state (b_msg). How is this correct?
(17)
What's the difference between pqLogMsgString() and pqLogMsgnchar()? They both take a length argument. Do they have to be separate functions?
(18)
if (conn->Pfdebug)
- fprintf(conn->Pfdebug, "To backend> %c\n", c);
+ pqStoreFrontendMsg(conn, LOG_BYTE1, 1);
To match the style for backend messages with pqLogMsgByte1 etc., why don't you wrap the conn->Pfdebug check in the macro?
(19)
@@ -520,8 +667,6 @@ pqPutMsgStart(char msg_type, bool force_len, PGconn *conn)
/* allow room for message length */
endPos += 4;
}
- else
- lenPos = -1;
Why is this change necessary?
(20)
+static void
+pqLogFrontendMsg(PGconn *conn)
...
+ for (i = 0; i < conn->fe_msg->field_num; i++)
+ {
+ message_addr = conn->fe_msg->fields[i].offset_in_buffer;
+ length = conn->fe_msg->fields[i].length;
+
+ switch (conn->fe_msg[i].type)
+ {
+ case LOG_BYTE1:
When I saw this switch condition, the confusion exploded. Shouldn't the type should be the attribute of each field as follows?
+ switch (conn->fe_msg->fields[i].type)
I also found the function names for frontend -> backend messages hard to grasp. Could you show the flow of function calls when sending messages to the backend?
(21)
+ uint16 result16;
+ memcpy(&result16, conn->outBuffer + message_addr, length);
You should have an empty line between the declaration part and execution part.
(22)
I couldn't understand pqLogBinaryMsg() at all. Could you explain what it does? Particularly, if all bytes in the string is printable, the function seems to do nothing:
+ if (isprint(v[i]))
+ continue;
Should the following part be placed outside the for loop? What does 'pin' mean? (I want the variable name to be more intuitive.)
+ if (pin < length)
+ fwrite(v + pin, 1, length - pin, conn->Pfdebug);
Regards
Takayuki Tsunakawa
From | Date | Subject | |
---|---|---|---|
Next Message | Justin Pryzby | 2021-01-17 05:29:54 | Re: PoC/WIP: Extended statistics on expressions |
Previous Message | tsunakawa.takay@fujitsu.com | 2021-01-17 05:16:09 | RE: Disable WAL logging to speed up data loading |