Re: Printing backtrace of postgres processes

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Printing backtrace of postgres processes
Date: 2021-02-01 00:44:12
Message-ID: CALj2ACXybys_tzWeatb9htVTuxKp29VErc6iJQYJBbdBZXv2fg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 29, 2021 at 7:10 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > 4) How about following
> > + errmsg("must be a superuser to print backtrace
> > of backend process")));
> > instead of
> > + errmsg("must be a superuser to print backtrace
> > of superuser query process")));
> >
>
> Here the message should include superuser, we cannot remove it. Non
> super user can log non super user provided if user has permissions for
> it.
>
> > 5) How about following
> > errmsg("must be a member of the role whose backed
> > process's backtrace is being printed or member of
> > pg_signal_backend")));
> > instead of
> > + errmsg("must be a member of the role whose
> > backtrace is being logged or member of pg_signal_backend")));
> >
>
> Modified it.

Maybe I'm confused here to understand the difference between
SIGNAL_BACKEND_NOSUPERUSER and SIGNAL_BACKEND_NOPERMISSION macros and
corresponding error messages. Some clarification/use case to know in
which scenarios we hit those error messages might help me. Did we try
to add test cases that show up these error messages for
pg_print_backtrace? If not, can we add?

> Attached v5 patch has the fixes for the same.
> Thoughts?

Thanks. Here are some comments on v5 patch:

1) typo - it's "process" not "porcess" + a backend porcess. For example:

2) select * from pg_print_backtrace(NULL);
postgres=# select proname, proisstrict from pg_proc where proname =
'pg_print_backtrace';
proname | proisstrict
--------------------+-------------
pg_print_backtrace | t

See the documentation:
"proisstrict bool

Function returns null if any call argument is null. In that case the
function won't actually be called at all. Functions that are not
“strict” must be prepared to handle null inputs."
So below PG_ARGISNUL check is not needed, you can remove that, because
pg_print_backtrace will not get called with null input.
int bt_pid = PG_ARGISNULL(0) ? -1 : PG_GETARG_INT32(0);

3) Can we just set results = true instead of PG_RETURN_BOOL(true); so
that it will be returned from PG_RETURN_BOOL(result); just for
consistency?
if (!SendProcSignal(bt_pid, PROCSIG_BACKTRACE_PRINT,
InvalidBackendId))
PG_RETURN_BOOL(true);
else
ereport(WARNING,
(errmsg("failed to send signal to postmaster: %m")));
}

PG_RETURN_BOOL(result);

4) Below is what happens if I request for a backtrace of the
postmaster process? 1388210 is pid of postmaster.
postgres=# select * from pg_print_backtrace(1388210);
WARNING: PID 1388210 is not a PostgreSQL server process
pg_print_backtrace
--------------------
f

Does it make sense to have a postmaster's backtrace? If yes, can we
have something like below in sigusr1_handler()?
if (CheckPostmasterSignal(PMSIGNAL_BACKTRACE_EMIT))
{
LogBackTrace();
}

5) Can we have PROCSIG_PRINT_BACKTRACE instead of
PROCSIG_BACKTRACE_PRINT, just for readability and consistency with
function name?

6) I think it's not the postmaster that prints backtrace of the
requested backend and we don't send SIGUSR1 with
PMSIGNAL_BACKTRACE_EMIT to postmaster, I think it's the backends, upon
receiving SIGUSR1 with PMSIGNAL_BACKTRACE_EMIT signal, log their own
backtrace. Am I missing anything here? If I'm correct, we need to
change the below description and other places wherever we refer to
this description.

The idea here is to implement & expose pg_print_backtrace function, internally
what this function does is, the connected backend will send SIGUSR1 signal by
setting PMSIGNAL_BACKTRACE_EMIT to the postmaster process. Once the process
receives this signal it will log the backtrace of the process.

7) Can we get the parallel worker's backtrace? IIUC it's possible.

8) What happens if a user runs pg_print_backtrace() on Windows or
MacOS or some other platform? Do you want to say something about other
platforms where gdb/addr2line doesn't exist?
+</programlisting>
+ You can get the file name and line number by using gdb/addr2line in
+ linux platforms, as a prerequisite users must ensure gdb/addr2line is
+ already installed:
+<programlisting>

9) Can't we reuse set_backtrace with just adding a flag to
set_backtrace(ErrorData *edata, int num_skip, bool
is_print_backtrace_function), making it a non-static function and call
set_backtrace(NULL, 0, true)?

void
set_backtrace(ErrorData *edata, int num_skip, bool is_print_backtrace_function)
{
StringInfoData errtrace;
-------
-------
if (is_print_backtrace_function)
elog(LOG_SERVER_ONLY, "current backtrace:%s", errtrace.data);
else
edata->backtrace = errtrace.data;
}

I think it will be good if we do this, because we can avoid duplicate
code in set_backtrace and LogBackTrace.

10) I think it's "pg_signal_backend" instead of "pg_print_backtrace"?
+ backtrace is being printed or the calling role has been granted
+ <literal>pg_print_backtrace</literal>, however only superusers can

11) In the documentation added, isn't it good if we talk a bit about
in which scenarios users can use this function? For instance,
something like "Use pg_print_backtrace to know exactly where it's
currently waiting when a backend process hangs."?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2021-02-01 01:18:12 Re: Single transaction in the tablesync worker?
Previous Message Euler Taveira 2021-02-01 00:23:04 Re: row filtering for logical replication