RE: libpq debug log

From: "iwata(dot)aya(at)fujitsu(dot)com" <iwata(dot)aya(at)fujitsu(dot)com>
To: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com>, 'Alvaro Herrera' <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: "'pgsql-hackers(at)lists(dot)postgresql(dot)org'" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: libpq debug log
Date: 2021-02-02 13:53:23
Message-ID: TY2PR01MB1963CCC04A0BA412E71617CFEAB59@TY2PR01MB1963.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi all,

Thank you Kirk san for creating the v14 patch.
I update the patch. I fixed all of Tsunakawa san's review comments.
I am trying to solve three bugs. Two bags were pointed out by Alvaro san
in a previous e-mail. And I found one bug.

> From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
> Sent: Friday, January 22, 2021 6:53 AM
...
> > (5)
> > @@ -966,10 +966,6 @@ pqSaveParameterStatus(PGconn *conn, const char
> *name, const char *value)
> > pgParameterStatus *pstatus;
> > pgParameterStatus *prev;
> >
> > - if (conn->Pfdebug)
> > - fprintf(conn->Pfdebug, "pqSaveParameterStatus: '%s' =
> '%s'\n",
> > - name, value);
> > -
> >
> > Where is this information output instead?
>
> Hmm, seems to have been lost. I restored it, but didn't verify
> the resulting behavior carefully.

I checked log output using Matsumura san's application app.c;
---
2021-01-27 11:29:49.718 < ParameterStatus 22 "application_name" ""
pqSaveParameterStatus: 'application_name' = ''
2021-01-27 11:29:49.718 < ParameterStatus 25 "client_encoding" "UTF8"
pqSaveParameterStatus: 'client_encoding' = 'UTF8'
---

New trace log prints "ParameterStatus" which report run-time parameter status.
And new log also prints parameter name and value in " StartupMessage " message.
We can know the parameter status using these protocol messages.
So I think "pqSaveParameterStatus:" is not necessary to output.

In StartupMessage, the protocol message is output as "UnknownMessage" like this.
[...] > UnknownMessage 38 '\x00\x03\x00\x00user\x00iwata\x00database\x00postgres\x00\x00'

To fix this, I want to move a protocol message without the first byte1 to a frontend message logging function.
I will work on this.

> From: 'Alvaro Herrera' <alvherre(at)alvh(dot)no-ip(dot)org>
> Sent: Friday, January 22, 2021 10:38 PM
> > Ah, that was the reason for separation. Then, I'm fine with either
> > keeping the separation, or integrating the two functions in fe-misc.c
> > with PQuntrace() being also there. I kind of think the latter is
> > better, because of code readability and, because tracing facility is
> > not a connection-related reature so categorizing it as "misc" feels
> > natural.
>
> Maybe we can create a new file specifically for this to avoid mixing
> unrelated stuff in fe-misc.c -- say fe-trace.c where all this new
> tracing stuff goes.

I moved PQtrace() from fe-connect.c to fe-misc.c.
And I agree with creating new file for new tracing functions. I will do this.

> From: 'Alvaro Herrera' <alvherre(at)alvh(dot)no-ip(dot)org>
> Sent: Thursday, January 28, 2021 11:14 PM
...
> On 2021-Jan-28, k(dot)jamison(at)fujitsu(dot)com wrote:
>
> > I realized that existing libpq regression tests in src/test/examples do not
> > test PQtrace(). At the same time, no other functions Is calling PQtrace(),
> > so it's expected that by default if there are no compilation errors, then it
> > will pass all the tests. And it did.
> >
> > So to check that the tracing feature is enabled and, as requested, outputs
> > a trace file, I temporarily modified a bit of testlibpq.c instead **based from
> > my current environment settings**, and ran the regression test.
>
> Yeah. It seems useful to build a real test harness based on the new
> tracing functionality. And that is precisely why I added the option to
> suppress timestamps: so that we can write trace files that do not vary
> from run to run. That allows us to have an "expected" trace to which we
> can compare the trace file produced by the actual run. I had the idea
> that instead of a timestamp we would print a message counter. I didn't
> implement that, however.

I think it is a useful suggestion. However I couldn't think of how to find out
if the logs were really in the correct order. And implementing this change looks very complicated.
So I would like to brush up this patch at first.

> (29)
> -void PQtrace(PGconn *conn, FILE *stream);
> +void PQtrace(PGconn *conn, FILE *stream, int flags);
>
> As I said before, I'm afraid we cannot change the signature of existing API
> functions. Please leave the signature of this function unchanged, and add a
> new function like PQtraceEx() that adds the flags argument.
>
> I guess the purpose of adding the flag argument is to not output the timestamp
> by default, because getting timestamps would incur significant overhead
> (however, I'm not sure if that overhead is worth worrying about given the
> already incurred logging overhead.) So, I think it's better to have a flag like
> PQTRACE_OUTPUT_TIMESTAMPS instead of
> PQTRACE_SUPPRESS_TIMESTAMPS, and the functions would look like:
>
> void
> PQtrace(PGconn *conn, FILE *stream)
> {
> PQtraceEx(conn, stream, 0);
> }
>
> void
> PQtraceEx(PGconn *conn, FILE *stream, int flags)
> {
> ... the new implementation in the patch
> }
>
> Remember to add PQtraceEx in exports.txt and make sure it builds on
> Windows with Visual Studio.

I fixed just add new function.
I will look into similar changes so far and give PQtraceEx() a better name.

> But... can you evaluate the overhead for getting timestamps and see if we can
> turn it on by default, or further, if we need such a flag and PQtraceEx()? For
> example, how about comparing the times to run the regression test with and
> without timestamps?

I will check it. And decide which is better by default.
In my understanding, one of the purpose of this flag is to make sure
the log order is correct regardless of timestamp. (for example, it is useful auto-regression test)
I think PQtraceEx() and this flag is necessary as a developer option at least.

> (30)
> +/*
> + * Deallocate FE/BE message tracking memory. We only do this because
> + * FE message can grow arbitrarily large, and skip it in the initial state,
> + * because it's likely pointless.
> + */
> +void
> +pqTraceUninit(PGconn *conn)
> +{
> + if (conn->fe_msg &&
> + conn->fe_msg->num_fields != DEF_FE_MSGFIELDS)
> + {
> + free(conn->fe_msg);
> + conn->fe_msg = NULL;
> + }
>
> What does the second if condition mean? If it's not necessary, change the
> comment accordingly.

I fix the comment and remove second if.

> (31)
> @@ -1011,11 +1615,16 @@ pqSendSome(PGconn *conn, int len)
> int
> pqFlush(PGconn *conn)
> {
> - if (conn->Pfdebug)
> - fflush(conn->Pfdebug);
> -
> if (conn->outCount > 0)
> + {
> + /* XXX comment */
> + if (conn->Pfdebug)
> + {
> + pqLogFrontendMsg(conn, -1);
> + fflush(conn->Pfdebug);
> + }
> return pqSendSome(conn, conn->outCount);
> + }
>
> Remove the comment line.

I remove it.

> (32)
> +#define INT_MAX (2048/2) /* maximum array size */
>
> INT_MAX is available via:
> #include <limits.h>

I fixed it.

> (33)
> /* realloc failed. Probably out of memory */
> - appendPQExpBufferStr(&conn->errorMessage,
> - "cannot allocate memory
> for input buffer\n");
> + appendPQExpBuffer(&conn->errorMessage,
> + "cannot allocate memory for input
> buffer\n");
>
> This change appears to be irrelevant.

I remove this code.

> (34)
> +static void
> +pqStoreFeMsgStart(PGconn *conn, char type)
> +{
> + if (PG_PROTOCOL_MAJOR(conn->pversion) >= 3)
> + conn->fe_msg->msg_type = type;
> +}
>
> Protocol version check is unnecessary because this tracing takes effect only in
> v3 protocol?

Yes, in PQtrace(), it is checked whether protocol 2.0 or 3.0 and tracing works only in v3.0. I remove it.

> (35)
> + conn->fe_msg =
> + realloc(conn->fe_msg,
> + sizeof(pqFrontendMessage) +
> + 2 * conn->fe_msg->max_fields *
> sizeof(pqFrontendMessageField));
>
> Align this size calculation with that in pqTraceInit().

I fixed it like this;
+ conn->fe_msg =
+ realloc(conn->fe_msg,
+ offsetof(pqFrontendMessage, fields) +
+ 2 * conn->fe_msg->max_fields * sizeof(pqFrontendMessageField));

> (36)
> + /* append milliseconds */
> + sprintf(timestr + strlen(timestr), ".%03d", (int) (tval.tv_usec / 1000));
>
> Output microsecond instead. As your example output shows, many lines
> show the exactly same timestamps and thus they are not distinguishable in
> time.

I changed %03d to %06d for microsecond.

> (37)
> +static char *
> +pqLogFormatTimestamp(char *timestr)
> +{
> ...
> +#define FORMATTED_TS_LEN 128
> + strftime(timestr, FORMATTED_TS_LEN,
>
> Add an argument to this function that indicates the size of timestr. Define
> FORMATTED_TS_LEN globally in this source file, and have callers use it.
> That is, the code like:
>
> + char timestr[128];
> +
> + fprintf(conn->Pfdebug, "%s\t>\t%s\t%d",
> + pqLogFormatTimestamp(timestr),
>
> becomes:
>
> + char timestr[FORMATTED_TS_LEN];
> +
> + fprintf(conn->Pfdebug, "%s\t>\t%s\t%d",
> + pqLogFormatTimestamp(timestr,
> sizeof(timestr)),

I fixed it.

> (38)
> + if ((conn->traceFlags & PQTRACE_SUPPRESS_TIMESTAMPS) == 0)
> + {
> + char timestr[128];
> +
> + fprintf(conn->Pfdebug, "%s\t>\t%s\t%d",
> + pqLogFormatTimestamp(timestr),
> +
> pqGetProtocolMsgType(conn->fe_msg->msg_type,
> +
> MSGDIR_FROM_FRONTEND),
> + msgLen);
> + }
> + else
> + fprintf(conn->Pfdebug, ">\t%s\t%d",
> +
> pqGetProtocolMsgType(conn->fe_msg->msg_type,
> +
> MSGDIR_FROM_FRONTEND),
> + msgLen);
>
> To reduce the code for easier modification, change the above code to
> something like:
>
> + char timestr[128];
> + char *timestr_p = "";
> +
> + if ((conn->traceFlags & PQTRACE_SUPPRESS_TIMESTAMPS) == 0)
> + timestr_p = pqLogFormatTimestamp(timestr);
> + fprintf(conn->Pfdebug, "%s\t>\t%s\t%d",
> + timestr_p,
> + pqGetProtocolMsgType(conn->fe_msg->msg_type,
> +
> MSGDIR_FROM_FRONTEND),
> + msgLen);

I fixed it.

This patch should address the following:
1. fix 3 bugs
1.1 -1 output in "Query" message
1.2 two message output in "ReadyForQuery" message
1.3 "StartupMessage" output as " UnknownMessage "
2. creating new file for new tracing functions

Regards,
Aya Iwata
Fujitsu

Attachment Content-Type Size
v15-0001-libpq-trace.patch application/octet-stream 29.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-02-02 13:53:37 Re: Typo in tablesync comment
Previous Message Amit Kapila 2021-02-02 13:26:15 Re: Single transaction in the tablesync worker?