Re: [PATCH] Fix Proposal - Deadlock Issue in Single User Mode When IO Failure Occurs

From: Jesse Zhang <sbjesse(at)gmail(dot)com>
To: chengyu(at)microsoft(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Fix Proposal - Deadlock Issue in Single User Mode When IO Failure Occurs
Date: 2018-12-01 17:54:44
Message-ID: CAGf+fX5r=+gdpYtFKqRA92+guFv7+3cUu6R3cPssQJm8xOHiuA@mail.gmail.com
Lists: pgsql-hackers

Hey Chengyu,
How did you set up your GDB to get "p" to pretty-print a Postgres list?

Cheers,
Jesse

On Fri, Nov 30, 2018 at 1:00 PM Chengchao Yu <chengyu(at)microsoft(dot)com> wrote:

> Greetings,
>
>
>
> Recently, we hit a few occurrences of a deadlock when an IO failure
> (including disk full and random remote-disk IO failures) happens in single
> user mode. We found the issue exists on both Linux and Windows, in multiple
> PostgreSQL versions.
>
>
>
> Here are the steps to reproduce on Linux (the Windows repro is similar):
>
>
>
> 1. Get the latest PostgreSQL code, then build and install the executables.
>
>
>
> $ git clone https://git.postgresql.org/git/postgresql.git
>
> $ cd postgresql
>
> $ PGROOT=$(pwd)
>
> $ git checkout REL_11_STABLE
>
> $ mkdir build
>
> $ cd build
>
> $ ../configure --prefix=/path/to/postgres
>
> $ make && make install
>
>
>
> 2. Run initdb to initialize a PG data directory.
>
>
>
> $ /path/to/postgres/bin/initdb -D /path/to/data
>
>
>
> 3. Because the scenario of being unable to write relation data is
> difficult to hit naturally, even when reserved space is turned off, I have
> prepared a small patch (see attachment “emulate-error.patch”) that forces
> an error whenever PG tries to write data to relation files. We can simply
> apply the patch, with no need to flood the disk with data.
>
>
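> For readers without the attachment handy, here is a minimal sketch of the
> kind of change such an emulation patch can make (an illustration only, not
> the attached patch itself): fail unconditionally at the top of mdwrite() in
> src/backend/storage/smgr/md.c. With exit_on_error=true the ERROR is
> escalated to FATAL, matching the log output below.
>
>     static void
>     mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
>             char *buffer, bool skipFsync)
>     {
>         /* Hypothetical emulation: fail every relation-file write. */
>         ereport(ERROR,
>                 (errcode(ERRCODE_IO_ERROR),
>                  errmsg("Emulate exception in mdwrite() when writing to disk")));
>
>         /* (the original mdwrite() body would follow; it is never reached) */
>     }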
>
> $ cd $PGROOT
>
> $ git apply /path/to/emulate-error.patch
>
> $ cd build
>
> $ make && make install
>
>
>
> 4. Connect to the newly initialized database cluster in single user
> mode, create a table, insert some data into it, and then either run a
> checkpoint or simply give EOF. We then hit the deadlock and the process
> will not exit until we kill it.
>
>
>
> Do a checkpoint explicitly:
>
>
>
> $ /path/to/postgres/bin/postgres --single -D /path/to/data/ postgres -c
> exit_on_error=true <<EOF
>
> > create table t1(a int);
>
> > insert into t1 values (1), (2), (3);
>
> > checkpoint;
>
> > EOF
>
>
>
> PostgreSQL stand-alone backend 11.1
>
> backend> backend> backend> 2018-11-29 02:55:27.891 UTC [18806] FATAL:
> Emulate exception in mdwrite() when writing to disk
>
> 2018-11-29 02:55:27.891 UTC [18806] CONTEXT: writing block 8 of relation
> base/12368/1247
>
> 2018-11-29 02:55:27.891 UTC [18806] STATEMENT: checkpoint;
>
>
>
> 2018-11-29 02:55:27.900 UTC [18806] FATAL: Emulate exception in mdwrite()
> when writing to disk
>
> 2018-11-29 02:55:27.900 UTC [18806] CONTEXT: writing block 8 of relation
> base/12368/1247
>
>
>
> Or directly give an EOF:
>
>
>
> $ /path/to/postgres/bin/postgres --single -D /path/to/data/ postgres -c
> exit_on_error=true <<EOF
>
> > create table t1(a int);
>
> > insert into t1 values (1), (2), (3);
>
> > EOF
>
>
>
> PostgreSQL stand-alone backend 11.1
>
> backend> backend> backend> 2018-11-29 02:55:24.438 UTC [18149] FATAL:
> Emulate exception in mdwrite() when writing to disk
>
> 2018-11-29 02:55:24.438 UTC [18149] CONTEXT: writing block 8 of relation
> base/12368/1247
>
>
>
> 5. Moreover, when we try to recover the database in single user
> mode, we hit the issue again, and the process neither brings up the
> database nor exits.
>
>
>
> $ /path/to/postgres/bin/postgres --single -D /path/to/data/ postgres -c
> exit_on_error=true
>
> 2018-11-29 02:59:33.257 UTC [19058] LOG: database system shutdown was
> interrupted; last known up at 2018-11-29 02:58:49 UTC
>
> 2018-11-29 02:59:33.485 UTC [19058] LOG: database system was not properly
> shut down; automatic recovery in progress
>
> 2018-11-29 02:59:33.500 UTC [19058] LOG: redo starts at 0/1672E40
>
> 2018-11-29 02:59:33.500 UTC [19058] LOG: invalid record length at
> 0/1684B90: wanted 24, got 0
>
> 2018-11-29 02:59:33.500 UTC [19058] LOG: redo done at 0/1684B68
>
> 2018-11-29 02:59:33.500 UTC [19058] LOG: last completed transaction was
> at log time 2018-11-29 02:58:49.856663+00
>
> 2018-11-29 02:59:33.547 UTC [19058] FATAL: Emulate exception in mdwrite()
> when writing to disk
>
> 2018-11-29 02:59:33.547 UTC [19058] CONTEXT: writing block 8 of relation
> base/12368/1247
>
>
>
> Analyses:
>
>
>
> So, what happened? There are actually two variants of the deadlock, both
> with the same root cause. Let’s first take a look at the scenario in step
> #5. In this scenario, the deadlock happens when a disk IO failure occurs
> inside StartupXLOG(). If we attach a debugger to the PG process, we can see
> the process is stuck acquiring the buffer’s lw-lock in AbortBufferIO().
>
>
>
> void
> AbortBufferIO(void)
> {
>     BufferDesc *buf = InProgressBuf;
>
>     if (buf)
>     {
>         uint32      buf_state;
>
>         /*
>          * Since LWLockReleaseAll has already been called, we're not holding
>          * the buffer's io_in_progress_lock. We have to re-acquire it so that
>          * we can use TerminateBufferIO. Anyone who's executing WaitIO on the
>          * buffer will be in a busy spin until we succeed in doing this.
>          */
>         LWLockAcquire(BufferDescriptorGetIOLock(buf), LW_EXCLUSIVE);
>
>
>
> This is because the same lock was already acquired, before the buffer
> manager attempted to flush the buffer page, in StartBufferIO():
>
>
>
> static bool
> StartBufferIO(BufferDesc *buf, bool forInput)
> {
>     uint32      buf_state;
>
>     Assert(!InProgressBuf);
>
>     for (;;)
>     {
>         /*
>          * Grab the io_in_progress lock so that other processes can wait for
>          * me to finish the I/O.
>          */
>         LWLockAcquire(BufferDescriptorGetIOLock(buf), LW_EXCLUSIVE);
>
>         buf_state = LockBufHdr(buf);
>
>         if (!(buf_state & BM_IO_IN_PROGRESS))
>             break;
>
>
>
> After reading the code, we see that AtProcExit_Buffers() assumes all
> lw-locks have already been released. However, in single user mode, at the
> time StartupXLOG() is being executed, no before_shmem_exit/on_shmem_exit
> callback that releases the lw-locks has been registered yet.
>
> And since lw-locks are non-reentrant, the process gets stuck re-acquiring
> the same lock.
>
>
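> (The call stack and the exit-callback lists below were collected roughly
> along these lines; the pid placeholder is hypothetical and stands for the
> stuck standalone backend:)
>
>     $ gdb -p <pid-of-stuck-backend>
>     (gdb) bt
>     (gdb) p on_shmem_exit_list
>     (gdb) p before_shmem_exit_list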
>
> Here is the call stack:
>
>
>
> (gdb) bt
>
> #0 0x00007f0fdb7cb6d6 in futex_abstimed_wait_cancelable (private=128,
> abstime=0x0, expected=0, futex_word=0x7f0fd14c81b8) at
> ../sysdeps/unix/sysv/linux/futex-internal.h:205
>
> #1 do_futex_wait (sem=sem(at)entry=0x7f0fd14c81b8, abstime=0x0) at
> sem_waitcommon.c:111
>
> #2 0x00007f0fdb7cb7c8 in __new_sem_wait_slow (sem=0x7f0fd14c81b8,
> abstime=0x0) at sem_waitcommon.c:181
>
> #3 0x00005630d475658a in PGSemaphoreLock (sema=0x7f0fd14c81b8) at
> pg_sema.c:316
>
> #4 0x00005630d47f689e in LWLockAcquire (lock=0x7f0fd9ae9c00,
> mode=LW_EXCLUSIVE) at
> /path/to/postgres/source/build/../src/backend/storage/lmgr/lwlock.c:1243
>
> #5 0x00005630d47cd087 in *AbortBufferIO* () at
> /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:3988
>
> #6 0x00005630d47cb3f9 in AtProcExit_Buffers (code=1, arg=0) at
> /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:2473
>
> #7 0x00005630d47dbc32 in shmem_exit (code=1) at
> /path/to/postgres/source/build/../src/backend/storage/ipc/ipc.c:272
>
> #8 0x00005630d47dba5e in proc_exit_prepare (code=1) at
> /path/to/postgres/source/build/../src/backend/storage/ipc/ipc.c:194
>
> #9 0x00005630d47db9c6 in proc_exit (code=1) at
> /path/to/postgres/source/build/../src/backend/storage/ipc/ipc.c:107
>
> #10 0x00005630d49811bc in *errfinish* (dummy=0) at
> /path/to/postgres/source/build/../src/backend/utils/error/elog.c:541
>
> #11 0x00005630d4801f1f in mdwrite (reln=0x5630d6588c68,
> forknum=MAIN_FORKNUM, blocknum=8, buffer=0x7f0fd1ae9c00 "",
> skipFsync=false) at
> /path/to/postgres/source/build/../src/backend/storage/smgr/md.c:843
>
> #12 0x00005630d4804716 in smgrwrite (reln=0x5630d6588c68,
> forknum=MAIN_FORKNUM, blocknum=8, buffer=0x7f0fd1ae9c00 "",
> skipFsync=false) at
> /path/to/postgres/source/build/../src/backend/storage/smgr/smgr.c:650
>
> #13 0x00005630d47cb824 in FlushBuffer (buf=0x7f0fd19e9c00,
> reln=0x5630d6588c68) at
> /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:2751
>
> #14 0x00005630d47cb219 in SyncOneBuffer (buf_id=0,
> skip_recently_used=false, wb_context=0x7ffccc371a70) at
> /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:2394
>
> #15 0x00005630d47cab00 in BufferSync (flags=6) at
> /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:1984
>
> #16 0x00005630d47cb57f in CheckPointBuffers (flags=6) at
> /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:2578
>
> #17 0x00005630d44a685b in CheckPointGuts (checkPointRedo=23612304,
> flags=6) at
> /path/to/postgres/source/build/../src/backend/access/transam/xlog.c:9149
>
> #18 0x00005630d44a62cf in CreateCheckPoint (flags=6) at
> /path/to/postgres/source/build/../src/backend/access/transam/xlog.c:8937
>
> #19 0x00005630d44a45e3 in *StartupXLOG* () at
> /path/to/postgres/source/build/../src/backend/access/transam/xlog.c:7723
>
> #20 0x00005630d4995f88 in InitPostgres (in_dbname=0x5630d65582b0
> "postgres", dboid=0, username=0x5630d653d7d0 "chengyu", useroid=0,
> out_dbname=0x0, override_allow_connections=false)
>
> at
> /path/to/postgres/source/build/../src/backend/utils/init/postinit.c:636
>
> #21 0x00005630d480b68b in PostgresMain (argc=7, argv=0x5630d6534d20,
> dbname=0x5630d65582b0 "postgres", username=0x5630d653d7d0 "chengyu") at
> /path/to/postgres/source/build/../src/backend/tcop/postgres.c:3810
>
> #22 0x00005630d4695e8b in main (argc=7, argv=0x5630d6534d20) at
> /path/to/postgres/source/build/../src/backend/main/main.c:224
>
>
>
> (gdb) p on_shmem_exit_list
>
> $1 = {{function = 0x55801cc68f5f <AnonymousShmemDetach>, arg = 0},
>
> {function = 0x55801cc68a4f <IpcMemoryDelete>, arg = 2490396},
>
> {function = 0x55801cc689f8 <IpcMemoryDetach>, arg = 140602018975744},
>
> {function = 0x55801cc6842e <ReleaseSemaphores>, arg = 0},
>
> {function = 0x55801ccec48a <dsm_postmaster_shutdown>, arg =
> 140602018975744},
>
> {function = 0x55801cd04053 <ProcKill>, arg = 0},
>
> {function = 0x55801cd0402d <RemoveProcFromArray>, arg = 0},
>
> {function = 0x55801ccf74e8 <CleanupInvalidationState>, arg =
> 140601991817088},
>
> {function = 0x55801ccf446f <CleanupProcSignalState>, arg = 1},
>
> {function = 0x55801ccdd3e5 <AtProcExit_Buffers>, arg = 0}, {function =
> 0x0, arg = 0}, {function = 0x0, arg = 0}, {function = 0x0, arg = 0},
> {function = 0x0, arg = 0}, {function = 0x0, arg = 0}, {function = 0x0, arg
> = 0}, {function = 0x0, arg = 0}, {function = 0x0, arg = 0}, {function =
> 0x0, arg = 0}, {function = 0x0, arg = 0}}
>
> (gdb) p before_shmem_exit_list
>
> $2 = {{function = 0x0, arg = 0} <repeats 20 times>}
>
>
>
> The second type is in step #4. By the time the “checkpoint” SQL command is
> executed, PG has already set up the before_shmem_exit callback
> ShutdownPostgres(), which releases all lw-locks given that a transaction or
> sub-transaction is in progress. So after the first IO error, the buffer
> page’s lw-lock is released successfully. However, ShutdownXLOG() is then
> invoked from the exit path, and PG tries to flush buffer pages again, which
> results in a second IO error. Unlike the first time, the previously executed
> before/on_shmem_exit callbacks are not invoked again, because their list
> indexes have already been decremented. So the lw-locks for buffer pages are
> not released, PG tries to acquire the same buffer lock in AbortBufferIO(),
> and the process gets stuck.
>
>
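> To make the index decrement concrete, the dispatch loop in shmem_exit()
> (src/backend/storage/ipc/ipc.c) is essentially the following sketch (not a
> verbatim quote). The index is decremented before each callback runs, so when
> a callback itself errors out and re-enters proc_exit()/shmem_exit(), the
> callbacks that already ran (including ShutdownPostgres(), which released the
> lw-locks the first time) are never revisited.
>
>     while (--before_shmem_exit_index >= 0)
>         before_shmem_exit_list[before_shmem_exit_index].function(code,
>                 before_shmem_exit_list[before_shmem_exit_index].arg);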
>
> Here is the call stack:
>
>
>
> (gdb) bt
>
> #0 0x00007ff0c0c036d6 in futex_abstimed_wait_cancelable (private=128,
> abstime=0x0, expected=0, futex_word=0x7ff0b69001b8) at
> ../sysdeps/unix/sysv/linux/futex-internal.h:205
>
> #1 do_futex_wait (sem=sem(at)entry=0x7ff0b69001b8, abstime=0x0) at
> sem_waitcommon.c:111
>
> #2 0x00007ff0c0c037c8 in __new_sem_wait_slow (sem=0x7ff0b69001b8,
> abstime=0x0) at sem_waitcommon.c:181
>
> #3 0x0000562077cc258a in PGSemaphoreLock (sema=0x7ff0b69001b8) at
> pg_sema.c:316
>
> #4 0x0000562077d6289e in LWLockAcquire (lock=0x7ff0bef225c0,
> mode=LW_EXCLUSIVE) at
> /path/to/postgres/source/build/../src/backend/storage/lmgr/lwlock.c:1243
>
> #5 0x0000562077d39087 in *AbortBufferIO* () at
> /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:3988
>
> #6 0x0000562077d373f9 in AtProcExit_Buffers (code=1, arg=0) at
> /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:2473
>
> #7 0x0000562077d47c32 in shmem_exit (code=1) at
> /path/to/postgres/source/build/../src/backend/storage/ipc/ipc.c:272
>
> #8 0x0000562077d47a5e in proc_exit_prepare (code=1) at
> /path/to/postgres/source/build/../src/backend/storage/ipc/ipc.c:194
>
> #9 0x0000562077d479c6 in proc_exit (code=1) at
> /path/to/postgres/source/build/../src/backend/storage/ipc/ipc.c:107
>
> #10 0x0000562077eed1bc in *errfinish* (dummy=0) at
> /path/to/postgres/source/build/../src/backend/utils/error/elog.c:541
>
> #11 0x0000562077d6df1f in mdwrite (reln=0x562078a12a18,
> forknum=MAIN_FORKNUM, blocknum=8, buffer=0x7ff0b6fbdc00 "",
> skipFsync=false) at
> /path/to/postgres/source/build/../src/backend/storage/smgr/md.c:843
>
> #12 0x0000562077d70716 in smgrwrite (reln=0x562078a12a18,
> forknum=MAIN_FORKNUM, blocknum=8, buffer=0x7ff0b6fbdc00 "",
> skipFsync=false) at
> /path/to/postgres/source/build/../src/backend/storage/smgr/smgr.c:650
>
> #13 0x0000562077d37824 in FlushBuffer (buf=0x7ff0b6e22f80,
> reln=0x562078a12a18) at
> /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:2751
>
> #14 0x0000562077d37219 in SyncOneBuffer (buf_id=78,
> skip_recently_used=false, wb_context=0x7fffb0e3e230) at
> /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:2394
>
> #15 0x0000562077d36b00 in BufferSync (flags=5) at
> /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:1984
>
> #16 0x0000562077d3757f in CheckPointBuffers (flags=5) at
> /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:2578
>
> #17 0x0000562077a1285b in CheckPointGuts (checkPointRedo=24049152,
> flags=5) at
> /path/to/postgres/source/build/../src/backend/access/transam/xlog.c:9149
>
> #18 0x0000562077a122cf in CreateCheckPoint (flags=5) at
> /path/to/postgres/source/build/../src/backend/access/transam/xlog.c:8937
>
> #19 0x0000562077a1164f in *ShutdownXLOG* (code=1, arg=0) at
> /path/to/postgres/source/build/../src/backend/access/transam/xlog.c:8485
>
> #20 0x0000562077d47c32 in shmem_exit (code=1) at
> /path/to/postgres/source/build/../src/backend/storage/ipc/ipc.c:272
>
> #21 0x0000562077d47a5e in proc_exit_prepare (code=1) at
> /path/to/postgres/source/build/../src/backend/storage/ipc/ipc.c:194
>
> #22 0x0000562077d479c6 in proc_exit (code=1) at
> /path/to/postgres/source/build/../src/backend/storage/ipc/ipc.c:107
>
> #23 0x0000562077eed1bc in *errfinish* (dummy=0) at
> /path/to/postgres/source/build/../src/backend/utils/error/elog.c:541
>
> #24 0x0000562077d6df1f in mdwrite (reln=0x562078a12a18,
> forknum=MAIN_FORKNUM, blocknum=8, buffer=0x7ff0b6fbdc00 "",
> skipFsync=false) at
> /path/to/postgres/source/build/../src/backend/storage/smgr/md.c:843
>
> #25 0x0000562077d70716 in smgrwrite (reln=0x562078a12a18,
> forknum=MAIN_FORKNUM, blocknum=8, buffer=0x7ff0b6fbdc00 "",
> skipFsync=false) at
> /path/to/postgres/source/build/../src/backend/storage/smgr/smgr.c:650
>
> #26 0x0000562077d37824 in FlushBuffer (buf=0x7ff0b6e22f80,
> reln=0x562078a12a18) at
> /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:2751
>
> #27 0x0000562077d37219 in SyncOneBuffer (buf_id=78,
> skip_recently_used=false, wb_context=0x7fffb0e3fb10) at
> /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:2394
>
> #28 0x0000562077d36b00 in BufferSync (flags=44) at
> /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:1984
>
> #29 0x0000562077d3757f in CheckPointBuffers (flags=44) at
> /path/to/postgres/source/build/../src/backend/storage/buffer/bufmgr.c:2578
>
> #30 0x0000562077a1285b in CheckPointGuts (checkPointRedo=24049152,
> flags=44) at
> /path/to/postgres/source/build/../src/backend/access/transam/xlog.c:9149
>
> #31 0x0000562077a122cf in CreateCheckPoint (flags=44) at
> /path/to/postgres/source/build/../src/backend/access/transam/xlog.c:8937
>
> #32 0x0000562077cca792 in RequestCheckpoint (flags=44) at
> /path/to/postgres/source/build/../src/backend/postmaster/checkpointer.c:976
>
> #33 0x0000562077d7bce4 in standard_ProcessUtility (pstmt=0x562078a00b50,
> queryString=0x562078a00100 "checkpoint;\n",
> context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
>
> dest=0x5620783ac5e0 <debugtupDR>, completionTag=0x7fffb0e41520 "") at
> /path/to/postgres/source/build/../src/backend/tcop/utility.c:769
>
> #34 0x0000562077d7b204 in ProcessUtility (pstmt=0x562078a00b50,
> queryString=0x562078a00100 "checkpoint;\n",
> context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
> dest=0x5620783ac5e0 <debugtupDR>,
>
> completionTag=0x7fffb0e41520 "") at
> /path/to/postgres/source/build/../src/backend/tcop/utility.c:360
>
> #35 0x0000562077d7a347 in PortalRunUtility (portal=0x5620789f20c0,
> pstmt=0x562078a00b50, isTopLevel=true, setHoldSnapshot=false,
> dest=0x5620783ac5e0 <debugtupDR>, completionTag=0x7fffb0e41520 "")
>
> at /path/to/postgres/source/build/../src/backend/tcop/pquery.c:1178
>
> #36 0x0000562077d7a534 in PortalRunMulti (portal=0x5620789f20c0,
> isTopLevel=true, setHoldSnapshot=false, dest=0x5620783ac5e0 <debugtupDR>,
> altdest=0x5620783ac5e0 <debugtupDR>,
>
> completionTag=0x7fffb0e41520 "") at
> /path/to/postgres/source/build/../src/backend/tcop/pquery.c:1324
>
> #37 0x0000562077d79a61 in PortalRun (portal=0x5620789f20c0,
> count=9223372036854775807, isTopLevel=true, run_once=true,
> dest=0x5620783ac5e0 <debugtupDR>, altdest=0x5620783ac5e0 <debugtupDR>,
>
> completionTag=0x7fffb0e41520 "") at
> /path/to/postgres/source/build/../src/backend/tcop/pquery.c:799
>
> #38 0x0000562077d734c5 in exec_simple_query (query_string=0x562078a00100
> "checkpoint;\n") at
> /path/to/postgres/source/build/../src/backend/tcop/postgres.c:1145
>
> #39 0x0000562077d77bd5 in PostgresMain (argc=7, argv=0x562078980d20,
> dbname=0x5620789a42b0 "postgres", username=0x5620789897d0 "chengyu") at
> /path/to/postgres/source/build/../src/backend/tcop/postgres.c:4182
>
> #40 0x0000562077c01e8b in main (argc=7, argv=0x562078980d20) at
> /path/to/postgres/source/build/../src/backend/main/main.c:224
>
>
>
> (gdb) p on_shmem_exit_list
>
> $9 = {{function = 0x562077cc2f5f <AnonymousShmemDetach>, arg = 0},
>
> {function = 0x562077cc2a4f <IpcMemoryDelete>, arg = 2457627},
>
> {function = 0x562077cc29f8 <IpcMemoryDetach>, arg = 140672005165056},
>
> {function = 0x562077cc242e <ReleaseSemaphores>, arg = 0},
>
> {function = 0x562077d4648a <dsm_postmaster_shutdown>, arg =
> 140672005165056},
>
> {function = 0x562077d5e053 <ProcKill>, arg = 0},
>
> {function = 0x562077d5e02d <RemoveProcFromArray>, arg = 0},
>
> {function = 0x562077d514e8 <CleanupInvalidationState>, arg =
> 140671978006400},
>
> {function = 0x562077d4e46f <CleanupProcSignalState>, arg = 1},
>
> {function = 0x562077d373e5 <AtProcExit_Buffers>, arg = 0},
>
> {function = 0x562077a1159d <ShutdownXLOG>, arg = 0},
>
> {function = 0x562077cd0637 <pgstat_beshutdown_hook>, arg = 0}, {function =
> 0x0, arg = 0}, {function = 0x0, arg = 0}, {function = 0x0, arg = 0},
> {function = 0x0, arg = 0}, {function = 0x0, arg = 0}, {function = 0x0, arg
> = 0}, {function = 0x0, arg = 0}, {function = 0x0, arg = 0}}
>
> (gdb) p before_shmem_exit_list
>
> $10 = {{function = 0x562077f02caa <ShutdownPostgres>, arg = 0}, {function
> = 0x0, arg = 0} <repeats 19 times>}
>
>
>
> OK, now we understand the deadlock issue for single user mode. However,
> does this issue also affect multi-user mode (i.e. running under the
> postmaster)? There are 3 cases to consider:
>
>
>
> 1. Startup process: by the time StartupXLOG() is invoked,
> ShutdownAuxiliaryProcess(), which releases all the lw-locks (see the
> sketch just after this list), has already been registered in
> before_shmem_exit_list[]. So this case is safe.
> 2. Checkpointer process: ShutdownXLOG() is not registered as a
> before/on_shmem_exit callback; instead, it is only invoked from the main
> loop. So there is no chance of hitting an IO error a second time during
> the shared-memory exit callbacks. Also, as with the startup process,
> ShutdownAuxiliaryProcess() has been registered. So this case is also
> safe.
> 3. Other backend/background processes: these processes do not handle
> XLOG startup or shutdown, and are protected by
> ShutdownAuxiliaryProcess(). So they are safe to exit too.
>
>
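> For reference, the sketch mentioned above: as of REL_11_STABLE,
> ShutdownAuxiliaryProcess() in src/backend/bootstrap/bootstrap.c does
> essentially the following (paraphrased, not a verbatim quote), which is
> exactly the process-local lock cleanup the standalone backend is missing:
>
>     static void
>     ShutdownAuxiliaryProcess(int code, Datum arg)
>     {
>         LWLockReleaseAll();
>         ConditionVariableCancelSleep();
>         pgstat_report_wait_end();
>     }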
>
> In addition, we have done multiple experiments to confirm these cases.
>
>
>
> Affected versions: we found this issue in 9.5, 9.6, 10, 11 and 12devel.
>
>
>
> Fix proposal:
>
>
>
> Given the two types of deadlock in single user mode discussed above, there
> may be multiple ways to fix this issue. In the fix proposal we would like
> to present, we register a new callback that releases all the lw-locks
> (just like ShutdownAuxiliaryProcess() does), ordered after
> AtProcExit_Buffers() and before ShutdownXLOG(). It is also registered
> before PG enters StartupXLOG(), so it covers the case where
> ShutdownPostgres() has not yet been registered. Here is the content of the
> proposal:
>
>
>
> diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
> index 62baaf0ab3..d74e8aa1d5 100644
> --- a/src/backend/utils/init/postinit.c
> +++ b/src/backend/utils/init/postinit.c
> @@ -71,6 +71,7 @@ static HeapTuple GetDatabaseTupleByOid(Oid dboid);
>  static void PerformAuthentication(Port *port);
>  static void CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connections);
>  static void InitCommunication(void);
> +static void ReleaseLWLocks(int code, Datum arg);
>  static void ShutdownPostgres(int code, Datum arg);
>  static void StatementTimeoutHandler(void);
>  static void LockTimeoutHandler(void);
> @@ -653,6 +654,7 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
>           * way, start up the XLOG machinery, and register to have it closed
>           * down at exit.
>           */
> +        on_shmem_exit(ReleaseLWLocks, 0);
>          StartupXLOG();
>          on_shmem_exit(ShutdownXLOG, 0);
>      }
> @@ -1214,6 +1216,23 @@ process_settings(Oid databaseid, Oid roleid)
>      heap_close(relsetting, AccessShareLock);
>  }
>
> +/*
> + * There are 2 types of buffer locks on-holding when AtProcExit_Buffers() is
> + * invoked in a bootstrap process or a standalone backend:
> + * (1) Exceptions thrown during StartupXLOG()
> + * (2) Exceptions thrown during exception-handling in ShutdownXLOG()
> + * So we need this on_shmem_exit callback for single user mode.
> + * For processes under postmaster, ShutdownAuxiliaryProcess() will release
> + * the lw-locks and ShutdownXLOG() is not registered as a callback, so there
> + * is no such issue. Also, please note this callback should be registered in
> + * the order after AtProcExit_buffers() and before ShutdownXLOG().
> + */
> +static void
> +ReleaseLWLocks(int code, Datum arg)
> +{
> +    LWLockReleaseAll();
> +}
> +
>  /*
>   * Backend-shutdown callback. Do cleanup that we want to be sure happens
>   * before all the supporting modules begin to nail their doors shut via
>
>
>
> The fix proposal is also attached to this email in file
> “fix-deadlock.patch”.
>
>
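> For completeness, a quick local check of the fix (with the emulation patch
> from step 3 still applied) is to rebuild and re-run the recovery attempt
> from step 5; the backend should still report the emulated FATAL error, but
> then exit promptly instead of hanging:
>
>     $ cd $PGROOT
>     $ git apply /path/to/fix-deadlock.patch
>     $ cd build
>     $ make && make install
>     $ /path/to/postgres/bin/postgres --single -D /path/to/data/ postgres -c exit_on_error=true
>     $ echo $?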
>
> Please let us know should you have suggestions on this issue and the fix.
>
>
>
> Thank you!
>
>
>
> Best regards,
>
> --
>
> Chengchao Yu
>
> Software Engineer | Microsoft | Azure Database for PostgreSQL
>
> https://azure.microsoft.com/en-us/services/postgresql/
>
>
>
