Re: DROP DATABASE is interruptible

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, Daniel Gustafsson <daniel(at)yesql(dot)se>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Evgeny Morozov <postgresql3(at)realityexists(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: DROP DATABASE is interruptible
Date: 2024-03-12 08:00:00
Message-ID: 05980f33-0cbc-16c3-732e-e8468546c54b@gmail.com
Lists: pgsql-hackers

Hi,

13.07.2023 23:52, Andres Freund wrote:
>
> Backpatching indeed was no fun. Not having BackgroundPsql.pm was the worst
> part. But also a lot of other conflicts in tests... Took me 5-6 hours or
> so.
> But I now finally pushed the fixes. Hope the buildfarm agrees with it...
>
> Thanks for the review!

I've discovered that the test 037_invalid_database, introduced with
c66a7d75e, hangs when the server is built with -DCLOBBER_CACHE_ALWAYS or run
with debug_discard_caches = 1 set via TEMP_CONFIG:
echo "debug_discard_caches = 1" >/tmp/extra.config
TEMP_CONFIG=/tmp/extra.config make -s check -C src/test/recovery/ PROVE_TESTS="t/037*"
# +++ tap check in src/test/recovery +++
[09:05:48] t/037_invalid_database.pl .. 6/?

regress_log_037_invalid_database ends with:
[09:05:51.622](0.021s) # issuing query via background psql:
#   CREATE DATABASE regression_invalid_interrupt;
#   BEGIN;
#   LOCK pg_tablespace;
#   PREPARE TRANSACTION 'lock_tblspc';
[09:05:51.684](0.062s) ok 8 - blocked DROP DATABASE completion

I see two backends waiting:
law      2420132 2420108  0 09:05 ?        00:00:00 postgres: node: law postgres [local] DROP DATABASE waiting
law      2420135 2420108  0 09:05 ?        00:00:00 postgres: node: law postgres [local] startup waiting

and the latter's stack trace:
#0  0x00007f65c8fd3f9a in epoll_wait (epfd=9, events=0x563c40e15478, maxevents=1, timeout=-1) at
../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1  0x0000563c3fa9a9fa in WaitEventSetWaitBlock (set=0x563c40e15410, cur_timeout=-1, occurred_events=0x7fff579dda80,
nevents=1) at latch.c:1570
#2  0x0000563c3fa9a8e4 in WaitEventSetWait (set=0x563c40e15410, timeout=-1, occurred_events=0x7fff579dda80, nevents=1,
wait_event_info=50331648) at latch.c:1516
#3  0x0000563c3fa99b14 in WaitLatch (latch=0x7f65c5e112e4, wakeEvents=33, timeout=0, wait_event_info=50331648) at
latch.c:538
#4  0x0000563c3fac7dee in ProcSleep (locallock=0x563c40e41e80, lockMethodTable=0x563c4007cba0 <default_lockmethod>) at
proc.c:1339
#5  0x0000563c3fab4160 in WaitOnLock (locallock=0x563c40e41e80, owner=0x563c40ea5af8) at lock.c:1816
#6  0x0000563c3fab2c80 in LockAcquireExtended (locktag=0x7fff579dde30, lockmode=1, sessionLock=false, dontWait=false,
reportMemoryError=true, locallockp=0x7fff579dde28) at lock.c:1080
#7  0x0000563c3faaf86d in LockRelationOid (relid=1213, lockmode=1) at lmgr.c:116
#8  0x0000563c3f537aff in relation_open (relationId=1213, lockmode=1) at relation.c:55
#9  0x0000563c3f5efde9 in table_open (relationId=1213, lockmode=1) at table.c:44
#10 0x0000563c3fca2227 in CatalogCacheInitializeCache (cache=0x563c40e8fe80) at catcache.c:980
#11 0x0000563c3fca255e in InitCatCachePhase2 (cache=0x563c40e8fe80, touch_index=true) at catcache.c:1083
#12 0x0000563c3fcc0556 in InitCatalogCachePhase2 () at syscache.c:184
#13 0x0000563c3fcb7db3 in RelationCacheInitializePhase3 () at relcache.c:4317
#14 0x0000563c3fce2748 in InitPostgres (in_dbname=0x563c40e54000 "postgres", dboid=5, username=0x563c40e53fe8 "law",
useroid=0, flags=1, out_dbname=0x0) at postinit.c:1177
#15 0x0000563c3fad90a7 in PostgresMain (dbname=0x563c40e54000 "postgres", username=0x563c40e53fe8 "law") at postgres.c:4229
#16 0x0000563c3f9f01e4 in BackendRun (port=0x563c40e45360) at postmaster.c:4475

It looks like no new backend can be started while the pg_tablespace lock is
held, when a new relcache init file has to be built during backend
initialization.
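
To make the suspected conflict concrete, here is a toy model of the table-level lock conflict rules, not server code. It assumes that the plain LOCK pg_tablespace in the test takes ACCESS EXCLUSIVE (the documented default mode), that the prepared transaction 'lock_tblspc' still holds it, and that lockmode=1 on relid=1213 in the backtrace is an AccessShareLock request on pg_tablespace. The function and variable names are made up for illustration.

```python
# Toy model of PostgreSQL table-level lock conflicts (illustration only).
# Mode numbers follow the server's lock mode numbering, 1..8.
ACCESS_SHARE = 1
ROW_SHARE = 2
ROW_EXCLUSIVE = 3
SHARE_UPDATE_EXCLUSIVE = 4
SHARE = 5
SHARE_ROW_EXCLUSIVE = 6
EXCLUSIVE = 7
ACCESS_EXCLUSIVE = 8

# For each requested mode, the set of held modes it conflicts with.
CONFLICTS = {
    ACCESS_SHARE: {8},
    ROW_SHARE: {7, 8},
    ROW_EXCLUSIVE: {5, 6, 7, 8},
    SHARE_UPDATE_EXCLUSIVE: {4, 5, 6, 7, 8},
    SHARE: {3, 4, 6, 7, 8},
    SHARE_ROW_EXCLUSIVE: {3, 4, 5, 6, 7, 8},
    EXCLUSIVE: {2, 3, 4, 5, 6, 7, 8},
    ACCESS_EXCLUSIVE: {1, 2, 3, 4, 5, 6, 7, 8},
}


def must_wait(requested, held_modes):
    """Return True if the requested mode conflicts with any held mode."""
    return any(held in CONFLICTS[requested] for held in held_modes)


PG_TABLESPACE_OID = 1213  # relid seen in frames #7 to #9

# Assumption: the prepared transaction keeps its ACCESS EXCLUSIVE lock
# until it is committed or rolled back.
held_locks = {PG_TABLESPACE_OID: [ACCESS_EXCLUSIVE]}

# The starting backend asks for lockmode=1 (AccessShareLock) on relid 1213.
blocked = must_wait(ACCESS_SHARE, held_locks[PG_TABLESPACE_OID])
print("startup backend must wait:", blocked)
```

Under these assumptions even the weakest lock request on pg_tablespace has to queue behind the prepared transaction, which matches the "startup waiting" backend shown above.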

Best regards,
Alexander
