A proposal to provide a timeout option for CREATE_REPLICATION_SLOT/pg_create_logical_replication_slot

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: chanderprabhjain95(at)gmail(dot)com
Subject: A proposal to provide a timeout option for CREATE_REPLICATION_SLOT/pg_create_logical_replication_slot
Date: 2022-06-09 04:55:06
Message-ID: CALj2ACVaGbYC=mj9K83zFp7RPZX6T3j5hu9u2FO9zGx_vGkb7g@mail.gmail.com
Lists: pgsql-hackers

Hi,

Currently, CREATE_REPLICATION_SLOT/pg_create_logical_replication_slot waits
indefinitely if there are any in-progress write transactions [1]. The wait
is there for a reason, namely to build the initial snapshot, but waiting
indefinitely hurts the usability of the command/function, and when it is
stuck, callers get no information as to why.

How about we provide a timeout for the command/function instead of letting
it wait indefinitely? The behavior would be something like this: if the
logical replication slot isn't created within the timeout, the
command/function fails.
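
To make the proposed semantics concrete, here is a toy model in Python. It
is only a sketch and does not touch PostgreSQL at all: a threading.Lock
stands in for the transaction lock that XactLockTableWait() blocks on in
[1], and the names create_slot_with_timeout, run_demo and
SlotCreationTimeout are made up for illustration.

```python
import threading


class SlotCreationTimeout(Exception):
    """Raised when the simulated slot creation exceeds its timeout."""


def create_slot_with_timeout(xact_lock: threading.Lock, timeout_s: float) -> str:
    """Toy model: wait for an in-progress transaction, but only up to timeout_s.

    xact_lock is a stand-in (an assumption of this sketch) for the
    transaction lock; it is held for as long as the transaction is running.
    """
    # Lock.acquire(timeout=...) returns False if the lock was not obtained in time.
    if not xact_lock.acquire(timeout=timeout_s):
        raise SlotCreationTimeout(
            f"could not create replication slot within {timeout_s} s"
        )
    xact_lock.release()
    return "slot created"


def run_demo(xact_duration_s: float, timeout_s: float) -> str:
    """Start a simulated write transaction, then try to create a slot."""
    xact_lock = threading.Lock()
    xact_lock.acquire()  # the "transaction" is now in progress

    # End the simulated transaction from another thread after xact_duration_s.
    timer = threading.Timer(xact_duration_s, xact_lock.release)
    timer.start()
    try:
        return create_slot_with_timeout(xact_lock, timeout_s)
    except SlotCreationTimeout as exc:
        return f"ERROR: {exc}"
    finally:
        timer.join()  # let the simulated transaction finish before returning
```

If the simulated transaction ends before the timeout, run_demo() reports
success; otherwise the caller gets an error that names the timeout instead
of hanging with no explanation.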

We could ask callers to set statement_timeout before calling
CREATE_REPLICATION_SLOT/pg_create_logical_replication_slot, but setting it
server-wide affects the queries running in all other sessions, and it may
not always be possible to set this parameter just for the session that runs
the CREATE_REPLICATION_SLOT command.

Thoughts?

[1]
(gdb) bt
#0 0x00007fc21509a45a in epoll_wait (epfd=9, events=0x561874204e88,
maxevents=1, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1 0x000056187350e9cc in WaitEventSetWaitBlock (set=0x561874204e28,
cur_timeout=-1, occurred_events=0x7fff72b3a4a0, nevents=1) at latch.c:1467
#2 0x000056187350e847 in WaitEventSetWait (set=0x561874204e28, timeout=-1,
occurred_events=0x7fff72b3a4a0, nevents=1, wait_event_info=50331653) at
latch.c:1413
#3 0x000056187350db64 in WaitLatch (latch=0x7fc21292f324, wakeEvents=33,
timeout=0, wait_event_info=50331653) at latch.c:475
#4 0x000056187353b5b2 in ProcSleep (locallock=0x56187422aa58,
lockMethodTable=0x561873a61a20 <default_lockmethod>) at proc.c:1337
#5 0x0000561873527e49 in WaitOnLock (locallock=0x56187422aa58,
owner=0x5618742888b0) at lock.c:1859
#6 0x0000561873526730 in LockAcquireExtended (locktag=0x7fff72b3a8a0,
lockmode=5, sessionLock=false, dontWait=false, reportMemoryError=true,
locallockp=0x0) at lock.c:1101
#7 0x0000561873525b9d in LockAcquire (locktag=0x7fff72b3a8a0, lockmode=5,
sessionLock=false, dontWait=false) at lock.c:752
#8 0x0000561873524099 in XactLockTableWait (xid=734, rel=0x0, ctid=0x0,
oper=XLTW_None) at lmgr.c:702
#9 0x00005618734a69c4 in SnapBuildWaitSnapshot (running=0x561874315a18,
cutoff=735) at snapbuild.c:1416
#10 0x00005618734a67a2 in SnapBuildFindSnapshot (builder=0x561874311a80,
lsn=21941704, running=0x561874315a18) at snapbuild.c:1328
#11 0x00005618734a62c4 in SnapBuildProcessRunningXacts
(builder=0x561874311a80, lsn=21941704, running=0x561874315a18) at
snapbuild.c:1117
#12 0x000056187348cab0 in standby_decode (ctx=0x5618742fb9e0,
buf=0x7fff72b3aa00) at decode.c:346
#13 0x000056187348c34e in LogicalDecodingProcessRecord (ctx=0x5618742fb9e0,
record=0x5618742fbda0) at decode.c:119
#14 0x000056187349124e in DecodingContextFindStartpoint
(ctx=0x5618742fb9e0) at logical.c:613
#15 0x00005618734c2ab3 in create_logical_replication_slot
(name=0x56187420d848 "slot1", plugin=0x56187420d8f8 "test_decoding",
temporary=false, two_phase=false, restart_lsn=0, find_startpoint=true) at
slotfuncs.c:158
#16 0x00005618734c2bb8 in pg_create_logical_replication_slot
(fcinfo=0x5618742efdd0) at slotfuncs.c:187
#17 0x00005618732def6b in ExecMakeTableFunctionResult
(setexpr=0x5618742dc318, econtext=0x5618742dc1d0,
argContext=0x5618742efcb0, expectedDesc=0x5618742ec098, randomAccess=false)
at execSRF.c:234
#18 0x00005618732fbc27 in FunctionNext (node=0x5618742dbfb8) at
nodeFunctionscan.c:95
#19 0x00005618732e0987 in ExecScanFetch (node=0x5618742dbfb8,
accessMtd=0x5618732fbb72 <FunctionNext>, recheckMtd=0x5618732fbf6e
<FunctionRecheck>) at execScan.c:133
#20 0x00005618732e0a00 in ExecScan (node=0x5618742dbfb8,
accessMtd=0x5618732fbb72 <FunctionNext>, recheckMtd=0x5618732fbf6e
<FunctionRecheck>) at execScan.c:182
#21 0x00005618732fbfc4 in ExecFunctionScan (pstate=0x5618742dbfb8) at
nodeFunctionscan.c:270
#22 0x00005618732dc693 in ExecProcNodeFirst (node=0x5618742dbfb8) at
execProcnode.c:463
#23 0x00005618732cfe80 in ExecProcNode (node=0x5618742dbfb8) at
../../../src/include/executor/executor.h:259

Regards,
Bharath Rupireddy.
