Re: Frontend error logging style

From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: Daniel Gustafsson <daniel(at)yesql(dot)se>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Frontend error logging style
Date: 2022-03-29 14:38:35
Message-ID: ce4b5fbb-4300-fc8e-db4e-3cb6f68ed2e3@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 29.03.22 12:13, Daniel Gustafsson wrote:

Most of these should probably be addressed separately from Tom's patch.

>> @@ -508,24 +502,15 @@ writefile(char *path, char **lines)
>
>> if (fclose(out_file))
>> - {
>> - pg_log_error("could not write file \"%s\": %m", path);
>> - exit(1);
>> - }
>> + pg_fatal("could not write file \"%s\": %m", path);
>> }
>
> Should we update this message to differentiate it from the fwrite error case?
> Something like "an error occurred during writing.."

Should be "could not close ...", no?

>> @@ -2057,10 +2004,7 @@ check_locale_name(int category, const char *locale, char **canonname)
>>
>> save = setlocale(category, NULL);
>> if (!save)
>> - {
>> - pg_log_error("setlocale() failed");
>> - exit(1);
>> - }
>> + pg_fatal("setlocale() failed");
>
> Should this gain a hint message for those users who have no idea what this
> really means?

My setlocale() man page says:

ERRORS
No errors are defined.

So uh ... ;-)

>> @@ -3089,18 +2979,14 @@ main(int argc, char *argv[])
>> else if (strcmp(optarg, "libc") == 0)
>> locale_provider = COLLPROVIDER_LIBC;
>> else
>> - {
>> - pg_log_error("unrecognized locale provider: %s", optarg);
>> - exit(1);
>> - }
>> + pg_fatal("unrecognized locale provider: %s", optarg);
>
> Should this %s be within quotes to match how we usually emit user-input?

Usually not done after colon, but could be.

>> @@ -1123,9 +1097,9 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
>> pg_log_warning("btree index \"%s.%s.%s\": btree checking function returned unexpected number of rows: %d",
>> rel->datinfo->datname, rel->nspname, rel->relname, ntups);
>> if (opts.verbose)
>> - pg_log_info("query was: %s", rel->sql);
>> - pg_log_warning("Are %s's and amcheck's versions compatible?",
>> - progname);
>> + pg_log_warning_detail("Query was: %s", rel->sql);
>> + pg_log_warning_hint("Are %s's and amcheck's versions compatible?",
>> + progname);
>
> Should "amcheck's" be a %s parameter to make translation reusable (which it
> miht never be) and possibly avoid translation mistake?

We don't have translations set up for contrib modules. Otherwise, this
kind of thing would probably be something to look into.

>> --- a/src/bin/pg_basebackup/receivelog.c
>> +++ b/src/bin/pg_basebackup/receivelog.c
>> @@ -140,7 +140,7 @@ open_walfile(StreamCtl *stream, XLogRecPtr startpoint)
>> /* fsync file in case of a previous crash */
>> if (stream->walmethod->sync(f) != 0)
>> {
>> - pg_log_fatal("could not fsync existing write-ahead log file \"%s\": %s",
>> + pg_log_error("could not fsync existing write-ahead log file \"%s\": %s",
>> fn, stream->walmethod->getlasterror());
>> stream->walmethod->close(f, CLOSE_UNLINK);
>> exit(1);
>
> In the case where we already have IO related errors, couldn't the close() call
> cause an additional error message which potentially could be helpful for
> debugging?

Yeah, I think in general we have been sloppy with reporting file-closing
errors properly. Those presumably happen very rarely, but when they do,
things are probably very bad.

>> @@ -597,31 +570,19 @@ main(int argc, char *argv[])
>
>> if (ControlFile->data_checksum_version == 0 &&
>> mode == PG_MODE_CHECK)
>> - {
>> - pg_log_error("data checksums are not enabled in cluster");
>> - exit(1);
>> - }
>> + pg_fatal("data checksums are not enabled in cluster");

> Fatal seems sort of out place here, it's not really a case of "something
> terrible happened" but rather "the preconditions weren't met". Couldn't these
> be a single pg_log_error erroring out with the reason in a pg_log_detail?

"fatal" means error plus exit, so this seems ok. There is no separate
log level for really-bad-error.

>> @@ -721,7 +721,7 @@ setFilePath(ArchiveHandle *AH, char *buf, const char *relativeFilename)
>> dname = ctx->directory;
>>
>> if (strlen(dname) + 1 + strlen(relativeFilename) + 1 > MAXPGPATH)
>
> Unrelated, but shouldn't that be >= MAXPGPATH?

Seems correct to me as is.

>> @@ -14951,18 +14942,18 @@ createViewAsClause(Archive *fout, const TableInfo *tbinfo)
>
>> - fatal("definition of view \"%s\" appears to be empty (length zero)",
>> - tbinfo->dobj.name);
>> + pg_fatal("definition of view \"%s\" appears to be empty (length zero)",
>> + tbinfo->dobj.name);
>
> I'm not sure we need to provide a definition of empty here, most readers will
> probably understand that already =)

It could mean, contains no columns, or something similar. If I had to
change it, I would remove the "empty" and keep the "length zero".

>> @@ -16602,13 +16593,10 @@ dumpSequence(Archive *fout, const TableInfo *tbinfo)
>> res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
>>
>> if (PQntuples(res) != 1)
>> - {
>> - pg_log_error(ngettext("query to get data of sequence \"%s\" returned %d row (expected 1)",
>> - "query to get data of sequence \"%s\" returned %d rows (expected 1)",
>> - PQntuples(res)),
>> - tbinfo->dobj.name, PQntuples(res));
>> - exit_nicely(1);
>> - }
>> + pg_fatal(ngettext("query to get data of sequence \"%s\" returned %d row (expected 1)",
>> + "query to get data of sequence \"%s\" returned %d rows (expected 1)",
>> + PQntuples(res)),
>> + tbinfo->dobj.name, PQntuples(res));
>
> The ngettext() call seems a bit out of place here since we already know that
> second form will be taken given the check on PQntuples(res).

See
<https://www.postgresql.org/message-id/flat/CALj2ACUfJKTmK5v%3DvF%2BH2iLkqM9Yvjsp6iXaCqAks6gDpzZh6g%40mail.gmail.com>
for a similar case that explains why this is still correct and necessary.

>> @@ -144,16 +145,10 @@ main(int argc, char *argv[])
>> if (alldb)
>> {
>> if (dbname)
>> - {
>> - pg_log_error("cannot cluster all databases and a specific one at the same time");
>> - exit(1);
>> - }
>> + pg_fatal("cannot cluster all databases and a specific one at the same time");
>>
>> if (tables.head != NULL)
>> - {
>> - pg_log_error("cannot cluster specific table(s) in all databases");
>> - exit(1);
>> - }
>> + pg_fatal("cannot cluster specific table(s) in all databases");
>
> An ngettext() candidate perhaps? There are more like this in main() hunks further down omitted for brevity here.

I would just rephrase this as

"cannot cluster specific tables in all databases"

This is still correct and sensible if the user specified just one table.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2022-03-29 14:41:18 Re: Identify missing publications from publisher while create/alter subscription.
Previous Message Peter Eisentraut 2022-03-29 14:24:18 Re: Frontend error logging style