Re: [BUGS] BUG #6184: Inconsistencies in log messages

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Ding Yuan <ding(dot)yuan(dot)90(at)hotmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUGS] BUG #6184: Inconsistencies in log messages
Date: 2012-08-15 22:42:51
Message-ID: 23384.1345070571@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> [ moved to hackers]

> Would someone review this report for me? I think there are few
> reasonable changes in here, but I am not familiar enough with all the
> code areas.

I see one place where we forgot to sync a comment, but otherwise this
all looks OK. Details below ...

>> ================== Report 1 ========================
>> The two elog messages are of similar situations but with different verbosity
>> levels.

The inconsistency here is intentional: it's okay for Mark to be a no-op,
as long as you don't ever do a Restore.

>> ======================== Report 2
>> ==========================================
>> The verbosity level for error return for CreateEvent is not consistent!

I'm not too familiar with the Windows-specific code, but I don't see
anything particularly wrong here. Failing to initialize in a wait for
an individual socket is perhaps recoverable, whereas failing to set up
the signal-handling emulation at all is not.

>> ======================== Report 3
>> ==========================================
>> The log message in the first snippet (line 3487) has ERROR level, while
>> PANIC
>> is logged in other similar situations.

They're not really similar since one case is mainline code while the
others are log replay. Actually, I think the mainline case is inside a
critical section so it's not going to matter too much anyway; and for
that matter there's no practical difference between ERROR and PANIC in
replay either. But by and large we've coded PANIC explicitly in log
replay, but not in initial creation/application of WAL records.

>> ======================== Report 4
>> ==========================================
>> There is no error message in the 2nd snippet, when:
>> 3089: if (!SplitIdentifierString(rawname, ',', &namelist))
>> occurred, while there is one at line 62 under similar situations.

[ shrug... ] It's not always the case that errors are to be reported
immediately where detected. In this particular case, the second
instance is inside a GUC check_hook, which is specifically required
to not throw errors, but instead return a failure indication to its
caller.

>> ======================== Report 5
>> ==========================================
>> The first log message at line 2221 is of ERROR level, while the other log
>> messages in similar situations are FATAL!

I think probably the point here is that erroring out of transaction
start is recoverable (you are still not in a transaction, and you could
try again) but erroring out of a later state transition is not going to
leave the transaction mechanism in a sane state. In any case, all of
those are "can't happen" errors so it's hard to get very excited about
them.

>> ======================== Report 6
>> ==========================================
>> The verbosity level of log messages are inconsistent!

Again, one of these is mainline code while the others are in recovery
context.

>> ======================== Report 7
>> ==========================================
>> The log message at line 3050 and line 3142 have inconsistent verbosity
>> levels!

The inconsistency here is a consequence of commit
82748bc253e00a07f0fcce3959af711b509a4e69, so it's entirely intentional,
and presumably a result of discussion in the mailing lists (I'm too lazy
to go look for the thread right now).

I notice that that commit failed to adjust the adjacent comment, though
--- tut tut, Bruce.

>> ======================== Report 8
>> ==========================================
>> The log message at 2292 is ERROR, inconsistent with the line @line 2521
>> (PANIC).

Failure of advance creation of an xlog file is recoverable, not having
one when we need it is not. Also see header comment for XLogFileInit.

>> ======================== Report 9
>> ==========================================
>> The 2 log messages: 624(at)comment(dot)c and 663(at)comment(dot)c have different
>> verbosity
>> levels!

Again, note the comment, which shows pretty clearly that this choice was
intentional. I'm not going to do a git blame for this one but you could
certainly track back where the code was changed to make it a WARNING.

>> ======================== Report 10
>> ==========================================
>> The first log message has WARNING while the others have ERROR!

Yup. The first one is in postgresql.conf parsing and the others are in
command line option parsing. We don't start up if command line options
are bad ... but once running, a mistake in postgresql.conf shouldn't
bring down the database.

On the whole, my impression is that this type of analysis might be
productive, but it would require a great deal more context awareness
(and knowledge of project-specific coding rules) to get it to the
point where it's not almost all false positives.

I doubt that there are very many mistakes in assignment of
NOTICE/WARNING/ERROR levels in our code. There's a certain lack of
consistency in use of ERROR/FATAL/PANIC in contexts where the coder
knows they're all the same anyway, namely just about anything that
runs in a non-transactional process. And the DEBUGn levels are all over
the map, because there's never been any global vision of which debug
messages are most likely to be interesting. I'm not sure if a tool
like this could help us enforce a policy about that ... but first
we'd have to have a policy ...

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Bruce Momjian 2012-08-15 23:06:12 Re: psql doesn't reuse -p after backend fail
Previous Message Bruce Momjian 2012-08-15 21:45:42 Re: [BUGS] BUG #6184: Inconsistencies in log messages

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2012-08-15 23:11:42 Re: [BUGS] BUG #6184: Inconsistencies in log messages
Previous Message Bruce Momjian 2012-08-15 22:30:40 Re: [COMMITTERS] pgsql: Clean up the #include mess a little.