Re: Patch to implement pg_current_logfile() function

From: Gilles Darold <gilles(dot)darold(at)dalibo(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Karl O(dot) Pinc" <kop(at)meme(dot)com>, Christoph Berg <myon(at)debian(dot)org>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch to implement pg_current_logfile() function
Date: 2016-12-09 22:41:25
Message-ID: 24ebcb40-1ec7-9148-03db-ac498bb0cc7f@dalibo.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Le 08/12/2016 à 00:52, Robert Haas a écrit :
> On Tue, Dec 6, 2016 at 6:11 PM, Gilles Darold <gilles(dot)darold(at)dalibo(dot)com> wrote:
>> It seems that all fixes have been included in this patch.
> Yikes. This patch has certainly had a lot of review, but it has lots
> of basic inconsistencies with PostgreSQL coding style, which one would
> think somebody would have noticed by now, and there are multiple
> places where the logic seems to do in a complicated way something that
> could be done significantly more simply. I don't have any objection
> to the basic idea of this patch, but it's got to look and feel like
> the surrounding PostgreSQL code. And not be unnecessarily
> complicated.
>
> Detailed comments below:
>
> The SGML doesn't match the surrounding style - for example, <para>
> typically appears on a line by itself.
Fixed.

> + if (!Logging_collector) {
>
> Formatting.
Fixed.

> rm_log_metainfo() could be merged into logfile_writename(), since
> they're always called in conjunction and in the same pattern. The
> function is poorly named; it should be something like
> update_metainfo_datafile().
Merge and logfile_writename() function renamed as suggested.

> + if (errno == ENFILE || errno == EMFILE)
> + ereport(LOG,
> + (errmsg("system is too busy to write logfile meta
> info, %s will be updated on next rotation (or use SIGHUP to retry)",
> LOG_METAINFO_DATAFILE)));
>
> This seems like a totally unprincipled reaction to a purely arbitrary
> subset of errors. EMFILE or ENFILE doesn't represent a general "too
> busy" condition, and logfile_open() has already logged the real error.
Removed.

> + snprintf(tempfn, sizeof(tempfn), "%s.tmp", LOG_METAINFO_DATAFILE);
>
> You don't really need to use snprintf() here. You could #define
> LOG_METAINFO_DATAFILE_TMP LOG_METAINFO_DATAFILE ".tmp" and compute
> this at compile time instead of runtime.
Done and added to syslogger.h

> + if (PG_NARGS() == 1) {
> + fmt = PG_ARGISNULL(0) ? NULL : PG_GETARG_TEXT_PP(0);
> + if (fmt != NULL) {
> + logfmt = text_to_cstring(fmt);
> + if ( (strcmp(logfmt, "stderr") != 0) &&
> + (strcmp(logfmt, "csvlog") != 0) ) {
> + ereport(ERROR,
> + (errmsg("log format %s not supported, possible values are
> stderr or csvlog", logfmt)));
> + PG_RETURN_NULL();
> + }
> + }
> + } else {
> + logfmt = NULL;
> + }
>
> Formatting. This is unlike PostgreSQL style in multiple ways,
> including cuddled braces, extra parentheses and spaces, and use of
> braces around a single-line block. Also, this could be written in a
> much less contorted way, like:
>
> if (PG_NARGS() == 0 || PG_ARGISNULL(0))
> logfmt = NULL;
> else
> {
> logfmt = text_to_cstring(PG_GETARG_TEXT_PP(0));
> if (strcmp(logfmt, "stderr") != 0 && strcmp(logfmt, "csvlog") != 0)
> ereport(...);
> }
Applied.

> Also, the ereport() needs an errcode(), not just an errmsg().
> Otherwise it defaults to ERRCODE_INTERNAL_ERROR, which is not correct.
Add errcode(ERRCODE_INVALID_PARAMETER_VALUE) to the ereport call.

> + /* Check if current log file is present */
> + if (stat(LOG_METAINFO_DATAFILE, &stat_buf) != 0)
> + PG_RETURN_NULL();
>
> Useless test. The code that follows catches this case anyway and
> handles it the same way. Which is good, because this has an inherent
> race condition. The previous if (!Logging_collector) test sems fairly
> useless too; unless there's a bug in your syslogger.c code, the file
> won't exist anyway, so we'll return NULL for that reason with no extra
> code needed here.
That's right, both test have been removed.

> + /*
> + * Read the file to gather current log filename(s) registered
> + * by the syslogger.
> + */
>
> Whitespace isn't right.
>
> + while (fgets(lbuffer, sizeof(lbuffer), fd) != NULL) {
> + char log_format[10];
> + int i = 0, space_pos = 0;
> +
> + /* Check for a read error. */
> + if (ferror(fd)) {
>
> More coding style issues.
>
> + ereport(ERROR,
> + (errcode_for_file_access(),
> + errmsg("could not read file \"%s\": %m",
> LOG_METAINFO_DATAFILE)));
> + FreeFile(fd);
> + break;
>
> ereport(ERROR) doesn't return, so the code that follows is pointless.
All issues above are fixed.

> + if (strchr(lbuffer, '\n') != NULL)
> + *strchr(lbuffer, '\n') = '\0';
>
> Probably worth adding a local variable instead of doing this twice.
> Local variables are cheap, and the code would be more readable.
Removed

> + if ((space_pos == 0) && (isspace(lbuffer[i]) != 0))
>
> Too many parentheses. Also, I think the whole loop in which this is
> contained could be eliminated entirely. Just search for the first ' '
> character with strchr(); you don't need to are about isspace() because
> you know the code that writes this file uses ' ' specifically.
> Overwrite it with '\0'. And then use a pointer to lbuffer for the log
> format and a pointer to lbuffer + strchr_result + 1 for the pathname.
Rewritten, not exactly the way you describe but like this:

/* extract log format and log file path from the line */
log_filepath = strchr(lbuffer, ' ');
log_filepath++;
lbuffer[log_filepath-lbuffer-1] = '\0';
log_format = lbuffer;
*strchr(log_filepath, '\n') = '\0';

it was possible to not use the additional log_format variable and just
keep the log format information in lbuffer, then use it the next code,
but I have kept the log_format variable for readability.

> + if ((space_pos != (int)strlen("stderr")) &&
> + (space_pos != (int)strlen("csvlog")))
> + {
> + ereport(ERROR,
> + (errmsg("unexpected line format in file %s",
> LOG_METAINFO_DATAFILE)));
> + break;
> + }
>
> I think this is pointless. Validating the length of the log format
> but not the content seems kind of silly, and why do either? The only
> way it's going to do the wrong thing is if the file contents are
> corrupt, and that should only happen if we have a bug. Which we
> probably won't, and if we do we'll fix it and then we won't, and this
> will just be deadweight -- and one more thing to update if we ever add
> support for another log format.
Removed.

> + /* Return null when csvlog is requested but we have a stderr log */
> + if ( (logfmt != NULL) && (strcmp(logfmt, "csvlog") == 0)
> + && !(Log_destination & LOG_DESTINATION_CSVLOG) )
> + PG_RETURN_NULL();
> +
> + /* Return null when stderr is requested but we have a csv log */
> + if ( (logfmt != NULL) && (strcmp(logfmt, "stderr") == 0)
> + && !(Log_destination & LOG_DESTINATION_STDERR) )
> + PG_RETURN_NULL();
>
> Seems complicated. Instead, you could declare char *result = NULL.
> When you find the row you're looking for, set result and break out of
> the loop. Then here you can just do if (result == NULL)
> PG_RETURN_NULL() and this complexity goes away.
Fixed.

> +/* in backend/utils/adt/misc.c and backend/postmaster/syslogger.c */
> +/*
> + * Name of file holding the paths, names, and types of the files where current
> + * log messages are written, when the log collector is enabled. Useful
> + * outside of Postgres when finding the name of the current log file in the
> + * case of time-based log rotation.
> + */
> +#define LOG_METAINFO_DATAFILE "current_logfiles"
>
> syslogger.h seems more appropriate. miscadmin.h is quite widely
> included and it would be better not to bloat it with things that
> aren't widely needed. It makes sense to use it for widely-used stuff
> that has no other obvious home, but since this is clearly a creature
> of the syslogger, it could logically go in syslogger.h.
Moved to syslogger.h

> This might not be exhaustive but I think fixing these things would get
> us much closer to a committable patch.

I have also run the code through pgindent as Tom suggest to fix all
pending coding style issues. I've just change on line from pgindent
work to prevent git apply to complain about a space in tabulation
issue. I don't know which is the best, let things like pgindent want to
write it or prevent git to complain. Attached is the v17 of the patch.

Thanks a lot for this synthetic review.

Regards

--
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

Attachment Content-Type Size
patch_pg_current_logfile-v17.diff text/x-diff 19.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Keith Fiske 2016-12-09 22:55:29 Re: [COMMITTERS] pgsql: Implement table partitioning.
Previous Message Kevin Grittner 2016-12-09 22:37:04 Re: [OSSTEST PATCH 0/1] PostgreSQL db: Retry on constraint violation