Re: Printing backtrace of postgres processes

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Daniel Gustafsson <daniel(at)yesql(dot)se>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Printing backtrace of postgres processes
Date: 2021-11-14 15:18:52
Message-ID: CALDaNm1SjDNaH=sHyD5emm8+F9f-A4x12Xxare0bJ_9iPmexLA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 12, 2021 at 6:11 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Fri, Nov 12, 2021 at 5:15 PM Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> >
> > On Thu, Nov 11, 2021 at 12:14 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > > Thanks for the comments, the attached v10 patch has the fixes for the same.
> >
> > Thanks for the patches. Here are some comments:
> >
> > 1) In the docs, let's have the similar description of
> > pg_log_backend_memory_contexts for pg_print_backtrace, just for the
> > continuity in the users readability.
> >
> > 2) I don't know how the <screen> part looks like in the Server
> > Signaling Functions table. I think here you can just say, it will emit
> > a warning and return false if not supported by the installation. And
> > you can give the <screen> part after the description where you are
> > showing a sample backtrace.
> >
> > + capture backtrace. If not available, the function will return false
> > + and a warning is issued, for example:
> > +<screen>
> > +WARNING: backtrace generation is not supported by this installation
> > + pg_print_backtrace
> > +--------------------
> > + f
> > +</screen>
> > + </para></entry>
> > + </row>
> >
> > 3) Replace '!' with '.'.
> > + * Note: this is called within a signal handler! All we can do is set
> >
> > 4) It is not only the next CFI but also the process specific interrupt
> > handlers (in your 0002 patch) right?
> > + * a flag that will cause the next CHECK_FOR_INTERRUPTS to invoke
> >
> > 5) I think you need to update CATALOG_VERSION_NO, mostly the committer
> > will take care of it but just in case.
> >
> > 6) Be consistent with casing "Verify" and "Might"
> > +# Verify that log output gets to the file
> > +# might need to retry if logging collector process is slow...
>
> Few more comments:
>
> 7) Do we need TAP tests for this function? I think it is sufficient to
> test the function in misc_functions.sql, please remove
> 002_print_backtrace_validation.pl. Note that we don't do similar TAP
> testing for pg_log_backend_memory_contexts as well.

I felt let's keep this test case, all the other tests just check if it
returns true or false, it does not checks for the contents in the
logfile. This is the only test which checks the logfile.

> 8) I hope you have manually tested the pg_print_backtrace for the
> startup process and other auxiliary processes.

Yes, I have checked them manually.

> 9) I think we can have a similar description (to the patch [1]). with
> links to each process definition in the glossary so that it will be
> easier for the users to follow the links and know what each process
> is. Especially, I didn't like the 0002 mentioned about the
> BackendPidGetProc or AuxiliaryPidGetProc as these are internal to the
> server and the users don't care about them.
>
> - * end on its own first and its backtrace are not logged is not a problem.
> + * BackendPidGetProc or AuxiliaryPidGetProc returns NULL if the pid isn't
> + * valid; but by the time we reach kill(), a process for which we get a
> + * valid proc here might have terminated on its own. There's no way to
> + * acquire a lock on an arbitrary process to prevent that. But since this
> + * mechanism is usually used to debug a backend running and consuming lots
> + * of CPU cycles, that it might end on its own first and its backtrace are
> + * not logged is not a problem.
> */
>
> Here's what I have written in the other patch [1], if okay please use this:
>
> + Requests to log memory contexts of the backend or the
> + <glossterm linkend="glossary-wal-sender">WAL sender</glossterm> or
> + the <glossterm linkend="glossary-auxiliary-proc">auxiliary
> process</glossterm>
> + with the specified process ID. All of the
> + <glossterm linkend="glossary-auxiliary-proc">auxiliary
> processes</glossterm>
> + are supported except the <glossterm
> linkend="glossary-logger">logger</glossterm>
> + and the <glossterm
> linkend="glossary-stats-collector">statistics collector</glossterm>
> + as they are not connected to shared memory the function can
> not make requests.
> + These memory contexts will be logged at
> <literal>LOG</literal> message level.
> + They will appear in the server log based on the log configuration set
> (See <xref linkend="runtime-config-logging"/> for more information),

I had mentioned BackendPidGetProc or AuxiliaryPidGetProc as comments
in the function, but I have not changed that. I have changed the
documentation similar to your patch.

Thanks for the comments, v11 patch attached at [1] has the changes for the same.
[1] - https://www.postgresql.org/message-id/CALDaNm3BYGOG3-PQvYbWkB%3DG3h1KYJ8CO8UYbzfECH4DYGMGqA%40mail.gmail.com

Regards,
Vignesh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2021-11-14 17:06:46 Re: JIT doing duplicative optimization?
Previous Message vignesh C 2021-11-14 15:16:00 Re: Printing backtrace of postgres processes