Re: RFC: pg_stat_logmsg

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: RFC: pg_stat_logmsg
Date: 2023-07-06 07:36:13
Message-ID: CAD21AoAyPVYZ_6VAQCOanvHTL3CRSqTDCFrtvaYVGwOQbrnxgQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Sat, Jul 1, 2023 at 8:57 AM Joe Conway <mail(at)joeconway(dot)com> wrote:
>
> Greetings,
>
> Attached please find a tarball (rather than a patch) for a proposed new
> contrib extension, pg_stat_logmsg.
>
> The basic idea is to mirror how pg_stat_statements works, except the
> logged messages keyed by filename, lineno, and elevel are saved with a
> aggregate count. The format string is displayed (similar to a query
> jumble) for context, along with function name and sqlerrcode.
>
>
> Part of the thinking is that people with fleets of postgres instances
> can use this to scan for various errors that they care about.
> Additionally it would be useful to look for "should not happen" errors.

Interesting idea and use cases.

I'm concerned that displaying the format string could not be helpful
in some cases. For example, when raising an ERROR while reading WAL
records, we typically write the error message stored in
record->errormsg_buf:

in ReadRecord():
if (errormsg)
ereport(emode_for_corrupt_record(emode, xlogreader->EndRecPtr),
(errmsg_internal("%s", errormsg) /* already
translated */ ));

In this case, the error message stored in pg_stat_logmsg is just '%s'.
The filename and line number columns might help identify the actual
error but it requires users to read the source code and cannot know
the actual error message.

A similar case is where we construct the error message on the fly. For
example, in LogRecoveryConflict() the string of the recovery conflict
description comes from get_recovery_conflict_desc():

in LogRecoveryConflict():
ereport(LOG,
errmsg("recovery still waiting after %ld.%03d ms: %s",
msecs, usecs, get_recovery_conflict_desc(reason)),
nprocs > 0 ? errdetail_log_plural("Conflicting process: %s.",
"Conflicting processes: %s.",
nprocs, buf.data) : 0);

The user might want to search the error message by the actual conflict
reason, but cannot. In this case, I'd like to see the actual error
message (I'd like to normalize the number part, though).

That being said, using the format string for the error messages like
"ERROR: relation "nonexist_table" does not exist" makes sense to me
since we can avoid having too many similar entries.

So I feel that we might need to figure out what part of the log
message should be normalized like pg_stat_statement does for query
strings.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2023-07-06 07:55:14 Re: pgsql: Fix search_path to a safe value during maintenance operations.
Previous Message huchangqi 2023-07-06 07:14:08 Re: Re: [PATCH] Add loongarch native checksum implementation.