Re: [HACKERS] possible self-deadlock window after bad ProcessStartupPacket

From: Asim R P <apraveen(at)pivotal(dot)io>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Jimmy Yih <jyih(at)pivotal(dot)io>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] possible self-deadlock window after bad ProcessStartupPacket
Date: 2018-07-19 00:26:21
Message-ID: CANXE4Tevcwdx-exHGPq22CzDk6KJhfJQfON4wzKj2rNmYdyxpg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 22, 2017 at 10:50 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Or, probably more robust: Simply _exit(2) without further ado, and rely
> on postmaster to output an appropriate error message. Arguably it's not
> actually useful to see hundreds of "WARNING: terminating connection because of
> crash of another server process" messages in the log anyway.
>

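To support using _exit(2) in the *quickdie() handlers, I would like to
share another stack trace that shows a self-deadlock. In this case, the
WAL writer process received a second SIGQUIT while it was still handling
the first one. The first wal_quickdie() call was inside exit(), running
exit handlers that had called free() (frames #7 to #15), when the second
wal_quickdie() called exit() again and blocked in free() on the malloc
lock (frames #0 to #5), which the interrupted free() presumably held.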

#0 __lll_lock_wait_private () at
../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
#1 0x00007f0bf04db2bd in _int_free (av=0x7f0bf081fb20 <main_arena>,
p=0x1557e60, have_lock=0) at malloc.c:3962
#2 0x00007f0bf04df53c in __GI___libc_free (mem=mem@entry=0x1557e70)
at malloc.c:2968
#3 0x00007f0bf0495025 in __run_exit_handlers (status=2,
listp=0x7f0bf081f5f8 <__exit_funcs>,
run_list_atexit=run_list_atexit@entry=true)
at exit.c:91
#4 0x00007f0bf0495045 in __GI_exit (status=<optimized out>) at exit.c:104
#5 0x0000000000843994 in wal_quickdie ()
#6 <signal handler called>
#7 0x00007f0bf04db014 in _int_free (av=0x7f0bf081fb20 <main_arena>,
p=<optimized out>, have_lock=0) at malloc.c:4014
#8 0x00007f0bf04df53c in __GI___libc_free (mem=<optimized out>) at
malloc.c:2968
#9 0x00007f0bebf8b2ba in ?? () from /usr/lib/x86_64-linux-gnu/libtasn1.so.6
#10 0x00007f0bebf8c4ba in asn1_delete_structure2 () from
/usr/lib/x86_64-linux-gnu/libtasn1.so.6
#11 0x00007f0beec24738 in ?? () from /usr/lib/x86_64-linux-gnu/libgnutls.so.30
#12 0x00007f0bf3bb6de7 in _dl_fini () at dl-fini.c:235
#13 0x00007f0bf0494ff8 in __run_exit_handlers (status=2,
listp=0x7f0bf081f5f8 <__exit_funcs>,
run_list_atexit=run_list_atexit@entry=true)
at exit.c:82
#14 0x00007f0bf0495045 in __GI_exit (status=<optimized out>) at exit.c:104
#15 0x0000000000843994 in wal_quickdie ()
#16 <signal handler called>
#17 0x00007f0bf05585b3 in __select_nocancel () at
../sysdeps/unix/syscall-template.S:84
#18 0x0000000000b7c5da in pg_usleep ()
#19 0x0000000000843c4a in WalWriterMain ()
#20 0x000000000059ac47 in AuxiliaryProcessMain ()
