Re: Printing backtrace of postgres processes

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: vignesh21(at)gmail(dot)com, robertmhaas(at)gmail(dot)com, pryzby(at)telsasoft(dot)com, stark(at)mit(dot)edu, torikoshia(at)oss(dot)nttdata(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Printing backtrace of postgres processes
Date: 2022-11-11 12:34:41
Message-ID: CALj2ACUNZVB0cQovvKBd53-upsMur8j-5_K=-fg86uAa+WYEWg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 11, 2022 at 7:59 AM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> At Thu, 10 Nov 2022 15:56:35 +0530, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote in
> > On Mon, Apr 18, 2022 at 9:10 AM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > >
> > > The attached v21 patch has the changes for the same.
> >
> > Thanks for the patch. Here are some comments:
> >
> > 1. I think there's a fundamental problem with this patch, that is, it
> > prints the backtrace when the interrupt is processed but not when
> > interrupt is received. This way, the backends/aux processes will
>
> Yeah, but the obstacle was backtrace(3) itself. Andres pointed [1]
> that that may be doable with some care (and I agree to the opinion).
> AFAIS no discussions followed and things have been to the current
> shape since then.
>
>
> [1] https://www.postgresql.org/message-id/20201201032649.aekv5b5dicvmovf4%40alap3.anarazel.de
> | > Surely this is *utterly* unsafe. You can't do that sort of stuff in
> | > a signal handler.
> |
> | That's of course true for the current implementation - but I don't think
> | it's a fundamental constraint. With a bit of care backtrace() and
> | backtrace_symbols() itself can be signal safe:
>
> man 3 backtrace
> > * backtrace() and backtrace_symbols_fd() don't call malloc() explic‐
> > itly, but they are part of libgcc, which gets loaded dynamically
> > when first used. Dynamic loading usually triggers a call to mal‐
> > loc(3). If you need certain calls to these two functions to not
> > allocate memory (in signal handlers, for example), you need to make
> > sure libgcc is loaded beforehand.

I missed that part. Thanks for pointing it out. The
backtrace_symbols() seems to be returning a malloc'ed array [1],
meaning it can't be used in signal handlers (if used, it can cause
deadlocks as per [2]) and existing set_backtrace() is using it.
Therefore, we need to either change set_backtrace() to use
backtrace_symbols_fd() instead of backtrace_symobls() or introduce
another function for the purpose of this feature. If done that, then
we can think of preloading of libgcc which makes backtrace(),
backtrace_symobols_fd() safe to use in signal handlers.

Looks like we're not loading libgcc explicitly now into any of
postgres processes, please correct me if I'm wrong here. If we're not
loading it right now, is it acceptable to load libgcc into every
postgres process for the sake of this feature?

[1] https://linux.die.net/man/3/backtrace_symbols
[2] https://stackoverflow.com/questions/40049751/malloc-inside-linux-signal-handler-cause-deadlock

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Borisov 2022-11-11 12:39:10 Re: Lockless queue of waiters in LWLock
Previous Message Maxim Orlov 2022-11-11 12:21:58 Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures