Re: Error message style guide, take 2

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Error message style guide, take 2
Date: 2003-08-16 23:53:36
Message-ID: 200308162353.h7GNraa14257@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


I have added this email to CVS as src/tools/error_text. Any changes to
it?

---------------------------------------------------------------------------

Tom Lane wrote:
> I'm about to start going through the backend's elog() calls to update
> them to ereport() style, add error code numbers, polish wording, etc.
> So it's time to nail down our style guide for message wording. Attached
> is a revision of the draft that Peter posted on 14-March. Any further
> comments?
>
> BTW, I'd like to SGML-ify this and put it into the developer's guide
> somewhere; any thoughts where exactly?
>
> regards, tom lane
>
>
>
> What goes where
> ---------------
>
> The primary message should be short, factual, and avoid reference to
> implementation details such as specific function names. "Short" means
> "should fit on one line under normal conditions". Use a detail message if
> needed to keep the primary message short, or if you feel a need to mention
> implementation details such as the particular system call that failed.
> Both primary and detail messages should be factual. Use a hint message
> for suggestions about what to do to fix the problem, especially if the
> suggestion might not always be applicable.
>
> For example, instead of
> IpcMemoryCreate: shmget(key=%d, size=%u, 0%o) failed: %m
> (plus a long addendum that is basically a hint)
> write
> Primary: Could not create shared memory segment: %m
> Detail: Failed syscall was shmget(key=%d, size=%u, 0%o)
> Hint: the addendum
>
> RATIONALE: keeping the primary message short helps keep it to the point,
> and lets clients lay out screen space on the assumption that one line is
> enough for error messages. Detail and hint messages may be relegated to a
> verbose mode, or perhaps a pop-up error-details window. Also, details and
> hints would normally be suppressed from the server log to save space.
> Reference to implementation details is best avoided since users don't know
> the details anyway.
>
>
> Formatting
> ----------
>
> Don't put any specific assumptions about formatting into the message
> texts. Expect clients and the server log to wrap lines to fit their own
> needs. In long messages, newline characters (\n) may be used to indicate
> suggested paragraph breaks. Don't end a message with a newline. Don't
> use tabs or other formatting characters. (In error context displays,
> newlines are automatically added to separate levels of context such
> as function calls.)
>
> RATIONALE: Messages are not necessarily displayed on terminal-type
> displays. In GUI displays or browsers these formatting instructions
> are at best ignored.
>
>
> Quotation marks
> ---------------
>
> English text should use double quotes when quoting is appropriate.
> Text in other languages should consistently use one kind of quotes
> that is consistent with publishing customs and computer output of
> other programs.
>
> RATIONALE: The choice of double quotes over single quotes is somewhat
> arbitrary, but tends to be the preferred use. Some have suggested
> choosing the kind of quotes depending on the type of object according to
> SQL conventions (namely, strings single quoted, identifiers double
> quoted). But this is a language-internal technical issue that many users
> aren't even familiar with, it won't scale to other kinds of quoted terms,
> it doesn't translate to other languages, and it's pretty pointless, too.
>
>
> Use of quotes
> -------------
>
> Use quotes always to delimit file names, user-supplied identifiers,
> and other variables that might contain words. Do not use them to
> mark up variables that will not contain words (for example, operator
> names).
>
> There are functions in the backend that will double-quote their own
> output at need (for example, format_type_be()). Do not put additional
> quotes around the output of such functions.
>
> RATIONALE: Objects can have names that create ambiguity when embedded
> in a message. Be consistent about denoting where a plugged-in name
> starts and ends. But don't clutter messages with unnecessary or
> duplicate quote marks.
>
>
> Grammar and punctuation
> -----------------------
>
> The rules are different for primary error messages and for detail/hint
> messages:
>
> Primary error messages: Do not capitalize the first letter. Do not end a
> message with a period. Do not even think about ending a message with an
> exclamation point.
>
> Detail and hint messages: Use complete sentences, and end each with
> a period. Capitalize the starts of sentences.
>
> RATIONALE: Avoiding punctuation makes it easier for client applications to
> embed the message into a variety of grammatical contexts. Often, primary
> messages are not grammatically complete sentences anyway. (And if they're
> long enough to be more than one sentence, they should be split into
> primary and detail parts.) However, detail and hint messages are longer
> and may need to include multiple sentences. For consistency, they should
> follow complete-sentence style even when there's only one sentence.
>
>
> Upper case vs. lower case
> -------------------------
>
> Use lower case for message wording, including the first letter of a
> primary error message. Use upper case for SQL commands and key words if
> they appear in the message.
>
> RATIONALE: It's easier to make everything look more consistent this
> way, since some messages are complete sentences and some not.
>
>
> Avoid passive voice
> -------------------
>
> Use the active voice. Use complete sentences when there is an acting
> subject ("A could not do B"). Use telegram style without subject if
> the subject would be the program itself; do not use "I" for the
> program.
>
> RATIONALE: The program is not human. Don't pretend otherwise.
>
>
> Present vs past tense
> ---------------------
>
> Use past tense if an attempt to do something failed, but could perhaps
> succeed next time (perhaps after fixing some problem). Use present tense
> if the failure is certainly permanent.
>
> There is a nontrivial semantic difference between sentences of the
> form
>
> could not open file "%s": %m
>
> and
>
> cannot open file "%s"
>
> The first one means that the attempt to open the file failed. The
> message should give a reason, such as "disk full" or "file doesn't
> exist". The past tense is appropriate because next time the disk
> might not be full anymore or the file in question may exist.
>
> The second form indicates the the functionality of opening the named
> file does not exist at all in the program, or that it's conceptually
> impossible. The present tense is appropriate because the condition
> will persist indefinitely.
>
> RATIONALE: Granted, the average user will not be able to draw great
> conclusions merely from the tense of the message, but since the
> language provides us with a grammar we should use it correctly.
>
>
> Type of the object
> ------------------
>
> When citing the name of an object, state what kind of object it is.
>
> RATIONALE: Else no one will know what "foo.bar.baz" is.
>
>
> Brackets
> --------
>
> Square brackets are only to be used (1) in command synopses to denote
> optional arguments, or (2) to denote an array subscript.
>
> RATIONALE: Anything else does not correspond to widely-known customary
> usage and will confuse people.
>
>
> Assembling error messages
> -------------------------
>
> When a message includes text that is generated elsewhere, embed it in
> this style:
>
> could not open file %s: %m
>
> RATIONALE: It would be difficult to account for all possible error codes
> to paste this into a single smooth sentence, so some sort of punctuation
> is needed. Putting the embedded text in parentheses has also been
> suggested, but it's unnatural if the embedded text is likely to be the
> most important part of the message, as is often the case.
>
>
> Reasons for errors
> ------------------
>
> Messages should always state the reason why an error occurred.
> For example:
>
> BAD: could not open file %s
> BETTER: could not open file %s (I/O failure)
>
> If no reason is known you better fix the code. ;-)
>
>
> Function names
> --------------
>
> Don't include the name of the reporting routine in the error text.
> We have other mechanisms for finding that out when needed, and for
> most users it's not helpful information. If the error text doesn't
> make as much sense without the function name, reword it.
>
> BAD: pg_atoi: error in "z": can't parse "z"
> BETTER: invalid input syntax for integer: "z"
>
> Avoid mentioning called function names, either; instead say what the code
> was trying to do:
>
> BAD: open() failed: %m
> BETTER: could not open file %s: %m
>
> If it really seems necessary, mention the system call in the detail
> message. (In some cases, providing the actual values passed to the
> system call might be appropriate information for the detail message.)
>
> RATIONALE: Users don't know what all those functions do.
>
>
> Tricky words to avoid
> ---------------------
>
> unable:
>
> "unable" is nearly the passive voice. Better use "cannot" or "could
> not", as appropriate.
>
> bad:
>
> Error messages like "bad result" are really hard to interpret
> intelligently. It's better to write why the result is "bad", e.g.,
> "invalid format".
>
> illegal:
>
> "Illegal" stands for a violation of the law, the rest is "invalid".
> Better yet, say why it's invalid.
>
> unknown:
>
> Try to avoid "unknown". Consider "error: unknown response". If you
> don't know what the response is, how do you know it's erroneous?
> "Unrecognized" is often a better choice. Also, be sure to include the
> value being complained of.
>
> BAD: unknown node type
> BETTER: unrecognized node type: 42
>
> find vs. exists:
>
> If the program uses a nontrivial algorithm to locate a resource (e.g., a
> path search) and that algorithm fails, it is fair to say that the program
> couldn't "find" the resource. If, on the other hand, the expected
> location of the resource is known but the program cannot access it there
> then say that the resource doesn't "exist". Using "find" in this case
> sounds weak and confuses the issue.
>
>
> Proper spelling
> ---------------
>
> Spell out words in full. For instance, avoid:
>
> spec
> stats
> parens
> auth
> xact
>
> RATIONALE: This will improve consistency.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly
>

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2003-08-17 00:11:08 Re: Error message style guide, take 2
Previous Message Bruce Momjian 2003-08-16 23:33:34 Re: compile error on cvs tip