Re: Patch to implement pg_current_logfile() function

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Gilles Darold <gilles(dot)darold(at)dalibo(dot)com>
Cc: "Karl O(dot) Pinc" <kop(at)meme(dot)com>, Christoph Berg <myon(at)debian(dot)org>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch to implement pg_current_logfile() function
Date: 2016-12-07 23:52:24
Message-ID: CA+TgmoZrO5+hYA1TX9DwgvK9vFmtvoxsoRubntgyC=LMQkkUow@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 6, 2016 at 6:11 PM, Gilles Darold <gilles(dot)darold(at)dalibo(dot)com> wrote:
> It seems that all fixes have been included in this patch.

Yikes. This patch has certainly had a lot of review, but it has lots
of basic inconsistencies with PostgreSQL coding style, which one would
think somebody would have noticed by now, and there are multiple
places where the logic seems to do in a complicated way something that
could be done significantly more simply. I don't have any objection
to the basic idea of this patch, but it's got to look and feel like
the surrounding PostgreSQL code. And not be unnecessarily
complicated.

Detailed comments below:

The SGML doesn't match the surrounding style - for example, <para>
typically appears on a line by itself.

+ if (!Logging_collector) {

Formatting.

rm_log_metainfo() could be merged into logfile_writename(), since
they're always called in conjunction and in the same pattern. The
function is poorly named; it should be something like
update_metainfo_datafile().

+ if (errno == ENFILE || errno == EMFILE)
+ ereport(LOG,
+ (errmsg("system is too busy to write logfile meta
info, %s will be updated on next rotation (or use SIGHUP to retry)",
LOG_METAINFO_DATAFILE)));

This seems like a totally unprincipled reaction to a purely arbitrary
subset of errors. EMFILE or ENFILE doesn't represent a general "too
busy" condition, and logfile_open() has already logged the real error.

+ snprintf(tempfn, sizeof(tempfn), "%s.tmp", LOG_METAINFO_DATAFILE);

You don't really need to use snprintf() here. You could #define
LOG_METAINFO_DATAFILE_TMP LOG_METAINFO_DATAFILE ".tmp" and compute
this at compile time instead of runtime.

+ if (PG_NARGS() == 1) {
+ fmt = PG_ARGISNULL(0) ? NULL : PG_GETARG_TEXT_PP(0);
+ if (fmt != NULL) {
+ logfmt = text_to_cstring(fmt);
+ if ( (strcmp(logfmt, "stderr") != 0) &&
+ (strcmp(logfmt, "csvlog") != 0) ) {
+ ereport(ERROR,
+ (errmsg("log format %s not supported, possible values are
stderr or csvlog", logfmt)));
+ PG_RETURN_NULL();
+ }
+ }
+ } else {
+ logfmt = NULL;
+ }

Formatting. This is unlike PostgreSQL style in multiple ways,
including cuddled braces, extra parentheses and spaces, and use of
braces around a single-line block. Also, this could be written in a
much less contorted way, like:

if (PG_NARGS() == 0 || PG_ARGISNULL(0))
logfmt = NULL;
else
{
logfmt = text_to_cstring(PG_GETARG_TEXT_PP(0));
if (strcmp(logfmt, "stderr") != 0 && strcmp(logfmt, "csvlog") != 0)
ereport(...);
}

Also, the ereport() needs an errcode(), not just an errmsg().
Otherwise it defaults to ERRCODE_INTERNAL_ERROR, which is not correct.

+ /* Check if current log file is present */
+ if (stat(LOG_METAINFO_DATAFILE, &stat_buf) != 0)
+ PG_RETURN_NULL();

Useless test. The code that follows catches this case anyway and
handles it the same way. Which is good, because this has an inherent
race condition. The previous if (!Logging_collector) test sems fairly
useless too; unless there's a bug in your syslogger.c code, the file
won't exist anyway, so we'll return NULL for that reason with no extra
code needed here.

+ /*
+ * Read the file to gather current log filename(s) registered
+ * by the syslogger.
+ */

Whitespace isn't right.

+ while (fgets(lbuffer, sizeof(lbuffer), fd) != NULL) {
+ char log_format[10];
+ int i = 0, space_pos = 0;
+
+ /* Check for a read error. */
+ if (ferror(fd)) {

More coding style issues.

+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not read file \"%s\": %m",
LOG_METAINFO_DATAFILE)));
+ FreeFile(fd);
+ break;

ereport(ERROR) doesn't return, so the code that follows is pointless.

+ if (strchr(lbuffer, '\n') != NULL)
+ *strchr(lbuffer, '\n') = '\0';

Probably worth adding a local variable instead of doing this twice.
Local variables are cheap, and the code would be more readable.

+ if ((space_pos == 0) && (isspace(lbuffer[i]) != 0))

Too many parentheses. Also, I think the whole loop in which this is
contained could be eliminated entirely. Just search for the first ' '
character with strchr(); you don't need to are about isspace() because
you know the code that writes this file uses ' ' specifically.
Overwrite it with '\0'. And then use a pointer to lbuffer for the log
format and a pointer to lbuffer + strchr_result + 1 for the pathname.

+ if ((space_pos != (int)strlen("stderr")) &&
+ (space_pos != (int)strlen("csvlog")))
+ {
+ ereport(ERROR,
+ (errmsg("unexpected line format in file %s",
LOG_METAINFO_DATAFILE)));
+ break;
+ }

I think this is pointless. Validating the length of the log format
but not the content seems kind of silly, and why do either? The only
way it's going to do the wrong thing is if the file contents are
corrupt, and that should only happen if we have a bug. Which we
probably won't, and if we do we'll fix it and then we won't, and this
will just be deadweight -- and one more thing to update if we ever add
support for another log format.

+ /* Return null when csvlog is requested but we have a stderr log */
+ if ( (logfmt != NULL) && (strcmp(logfmt, "csvlog") == 0)
+ && !(Log_destination & LOG_DESTINATION_CSVLOG) )
+ PG_RETURN_NULL();
+
+ /* Return null when stderr is requested but we have a csv log */
+ if ( (logfmt != NULL) && (strcmp(logfmt, "stderr") == 0)
+ && !(Log_destination & LOG_DESTINATION_STDERR) )
+ PG_RETURN_NULL();

Seems complicated. Instead, you could declare char *result = NULL.
When you find the row you're looking for, set result and break out of
the loop. Then here you can just do if (result == NULL)
PG_RETURN_NULL() and this complexity goes away.

+/* in backend/utils/adt/misc.c and backend/postmaster/syslogger.c */
+/*
+ * Name of file holding the paths, names, and types of the files where current
+ * log messages are written, when the log collector is enabled. Useful
+ * outside of Postgres when finding the name of the current log file in the
+ * case of time-based log rotation.
+ */
+#define LOG_METAINFO_DATAFILE "current_logfiles"

syslogger.h seems more appropriate. miscadmin.h is quite widely
included and it would be better not to bloat it with things that
aren't widely needed. It makes sense to use it for widely-used stuff
that has no other obvious home, but since this is clearly a creature
of the syslogger, it could logically go in syslogger.h.

This might not be exhaustive but I think fixing these things would get
us much closer to a committable patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-12-07 23:57:05 Re: Patch to implement pg_current_logfile() function
Previous Message Tom Lane 2016-12-07 23:49:57 Re: Back-patch use of unnamed POSIX semaphores for Linux?