Re: pgsql: Compute XID horizon for page level index vacuum on primary.

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-committers(at)lists(dot)postgresql(dot)org, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: pgsql: Compute XID horizon for page level index vacuum on primary.
Date: 2019-03-28 04:34:52
Message-ID: CA+hUKGLCwPF0S4Mk7S8qw+DK0Bq65LueN9rofAA3HHSYikW-Zw@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Wed, Mar 27, 2019 at 1:06 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> Compute XID horizon for page level index vacuum on primary.

Hi Andres,

I have a virtual machine running FreeBSD 12.0 on i386 on which
contrib/test_decoding consistently self-deadlocks in the "rewrite"
test, with the stack listed below. You can see that we wait for a
share lock on buffer 129 (frames #4 and #5) that we already hold
exclusively (see held_lwlocks after the stack). Peter Geoghegan
spotted the problem: this code path shouldn't access syscache, or at
least not for a catalog table. He suggested something along these lines:

--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -6977,7 +6977,10 @@ heap_compute_xid_horizon_for_tuples(Relation rel,
 	 * simplistic, but at the moment there is no evidence of that or any idea
 	 * about what would work better.
 	 */
-	io_concurrency = get_tablespace_io_concurrency(rel->rd_rel->reltablespace);
+	if (IsCatalogRelation(rel))
+		io_concurrency = 1;
+	else
+		io_concurrency = get_tablespace_io_concurrency(rel->rd_rel->reltablespace);
 	prefetch_distance = Min((io_concurrency) + 10, MAX_IO_CONCURRENCY);
 
 	/* Start prefetching. */

Indeed that seems to fix the problem for me.
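
To make the failure mode concrete, here is a minimal single-process sketch of why the guard helps. It is a toy model, not PostgreSQL code: every toy_* name, the held-lock table, and the made-up return value 4 are assumptions for illustration. It models only what the trace shows: a lock that is not re-entrant, held exclusively on a buffer, and then requested again by the same backend during a cache lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock modes; illustrative names, not PostgreSQL's LWLockMode. */
typedef enum { TOY_SHARED, TOY_EXCLUSIVE } ToyLockMode;

/* One entry per lock that the (single) toy backend currently holds. */
typedef struct {
    int         buffer;   /* buffer the lock protects */
    ToyLockMode mode;
} ToyHeldLock;

#define TOY_MAX_HELD 8
static ToyHeldLock toy_held[TOY_MAX_HELD];
static size_t      toy_num_held = 0;

/* Record that this backend now holds a lock on `buffer`. */
static void toy_hold(int buffer, ToyLockMode mode)
{
    assert(toy_num_held < TOY_MAX_HELD);
    toy_held[toy_num_held].buffer = buffer;
    toy_held[toy_num_held].mode = mode;
    toy_num_held++;
}

/*
 * With non-recursive locks, a request for a buffer that this backend
 * already holds exclusively can never be granted: the only holder is
 * the waiter itself.
 */
static bool toy_would_self_deadlock(int buffer)
{
    for (size_t i = 0; i < toy_num_held; i++)
        if (toy_held[i].buffer == buffer && toy_held[i].mode == TOY_EXCLUSIVE)
            return true;
    return false;
}

/*
 * Stand-in for a cache lookup that misses and has to read a catalog
 * index page (`index_buffer`) under a share lock. Returns -1 where the
 * real backend would hang; 4 is an arbitrary made-up setting.
 */
static int toy_lookup_io_concurrency(int index_buffer)
{
    if (toy_would_self_deadlock(index_buffer))
        return -1;
    return 4;
}

/* Same shape as the guard in the patch: skip the lookup for catalogs. */
static int toy_choose_io_concurrency(bool is_catalog, int index_buffer)
{
    if (is_catalog)
        return 1;
    return toy_lookup_io_concurrency(index_buffer);
}
```

After toy_hold(129, TOY_EXCLUSIVE), toy_lookup_io_concurrency(129) reports the would-be hang, while toy_choose_io_concurrency(true, 129) returns 1 without touching the lock table at all. That is the property the suggested change relies on.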

* frame #0: 0x28c04ca1 libc.so.7`__sys__umtx_op + 5
frame #1: 0x28bed0ab libc.so.7`sem_clockwait_np + 283
frame #2: 0x28bed1ae libc.so.7`sem_wait + 62
frame #3: 0x0858d837 postgres`PGSemaphoreLock(sema=0x290141a8) at
pg_sema.c:316
frame #4: 0x08678b94 postgres`LWLockAcquire(lock=0x295365a4,
mode=LW_SHARED) at lwlock.c:1244
frame #5: 0x08639d8f postgres`LockBuffer(buffer=129, mode=1) at
bufmgr.c:3565
frame #6: 0x08187f7d postgres`_bt_getbuf(rel=0x31c3e95c, blkno=1,
access=1) at nbtpage.c:806
frame #7: 0x081887f9 postgres`_bt_getroot(rel=0x31c3e95c,
access=1) at nbtpage.c:323
frame #8: 0x081932aa postgres`_bt_search(rel=0x31c3e95c,
key=0xffbfad00, bufP=0xffbfb31c, access=1, snapshot=0x08b27c58) at
nbtsearch.c:99
frame #9: 0x08195bfc postgres`_bt_first(scan=0x31db73a0,
dir=ForwardScanDirection) at nbtsearch.c:1246
frame #10: 0x08190f96 postgres`btgettuple(scan=0x31db73a0,
dir=ForwardScanDirection) at nbtree.c:245
frame #11: 0x0817d3fa postgres`index_getnext_tid(scan=0x31db73a0,
direction=ForwardScanDirection) at indexam.c:550
frame #12: 0x0817d6a8 postgres`index_getnext_slot(scan=0x31db73a0,
direction=ForwardScanDirection, slot=0x31df2320) at indexam.c:642
frame #13: 0x0817b4c9
postgres`systable_getnext(sysscan=0x31df242c) at genam.c:450
frame #14: 0x0887e3a3 postgres`ScanPgRelation(targetRelId=1213,
indexOK=true, force_non_historic=false) at relcache.c:365
frame #15: 0x088742e1 postgres`RelationBuildDesc(targetRelId=1213,
insertIt=true) at relcache.c:1055
frame #16: 0x0887356a
postgres`RelationIdGetRelation(relationId=1213) at relcache.c:2030
frame #17: 0x080d7ac5 postgres`relation_open(relationId=1213,
lockmode=1) at relation.c:59
frame #18: 0x081cc2b6 postgres`table_open(relationId=1213,
lockmode=1) at table.c:43
frame #19: 0x0886597b
postgres`SearchCatCacheMiss(cache=0x31c2b200, nkeys=1,
hashValue=1761185739, hashIndex=3, v1=1663, v2=0, v3=0, v4=0) at
catcache.c:1357
frame #20: 0x088622db
postgres`SearchCatCacheInternal(cache=0x31c2b200, nkeys=1, v1=1663,
v2=0, v3=0, v4=0) at catcache.c:1299
frame #21: 0x08862354 postgres`SearchCatCache1(cache=0x31c2b200,
v1=1663) at catcache.c:1167
frame #22: 0x0888406a postgres`SearchSysCache1(cacheId=61,
key1=1663) at syscache.c:1119
frame #23: 0x088834de postgres`get_tablespace(spcid=1663) at spccache.c:136
frame #24: 0x08883617
postgres`get_tablespace_io_concurrency(spcid=0) at spccache.c:217
frame #25: 0x08155a82
postgres`heap_compute_xid_horizon_for_tuples(rel=0x31cbee40,
tids=0x31df146c, nitems=3) at heapam.c:6980
frame #26: 0x0817b09d
postgres`table_compute_xid_horizon_for_tuples(rel=0x31cbee40,
items=0x31df146c, nitems=3) at tableam.h:708
frame #27: 0x0817b03a
postgres`index_compute_xid_horizon_for_tuples(irel=0x31c3e95c,
hrel=0x31cbee40, ibuf=129, itemnos=0xffbfbb8c, nitems=3) at
genam.c:306
frame #28: 0x0818ae92 postgres`_bt_delitems_delete(rel=0x31c3e95c,
buf=129, itemnos=0xffbfbb8c, nitems=3, heapRel=0x31cbee40) at
nbtpage.c:1111
frame #29: 0x0818405b postgres`_bt_vacuum_one_page(rel=0x31c3e95c,
buffer=129, heapRel=0x31cbee40) at nbtinsert.c:2270
frame #30: 0x08180a4f postgres`_bt_findinsertloc(rel=0x31c3e95c,
insertstate=0xffbfcce0, checkingunique=true, stack=0x00000000,
heapRel=0x31cbee40) at nbtinsert.c:736
frame #31: 0x0817f40c postgres`_bt_doinsert(rel=0x31c3e95c,
itup=0x31db69f4, checkUnique=UNIQUE_CHECK_YES, heapRel=0x31cbee40) at
nbtinsert.c:281
frame #32: 0x08190416 postgres`btinsert(rel=0x31c3e95c,
values=0xffbfce54, isnull=0xffbfce34, ht_ctid=0x31db42e4,
heapRel=0x31cbee40, checkUnique=UNIQUE_CHECK_YES,
indexInfo=0x31db67dc) at nbtree.c:203
frame #33: 0x0817c173
postgres`index_insert(indexRelation=0x31c3e95c, values=0xffbfce54,
isnull=0xffbfce34, heap_t_ctid=0x31db42e4, heapRelation=0x31cbee40,
checkUnique=UNIQUE_CHECK_YES, indexInfo=0x31db67dc) at indexam.c:212
frame #34: 0x0823cca4
postgres`CatalogIndexInsert(indstate=0x31db9228, heapTuple=0x31db42e0)
at indexing.c:140
frame #35: 0x0823cd72
postgres`CatalogTupleUpdate(heapRel=0x31cbee40, otid=0x31db42e4,
tup=0x31db42e0) at indexing.c:215
frame #36: 0x088768ed
postgres`RelationSetNewRelfilenode(relation=0x31c2fb38,
persistence='p', freezeXid=0, minmulti=0) at relcache.c:3508
frame #37: 0x0823b5df postgres`reindex_index(indexId=2672,
skip_constraint_checks=true, persistence='p', options=0) at
index.c:3700
frame #38: 0x0823bf00 postgres`reindex_relation(relid=1262,
flags=18, options=0) at index.c:3946
frame #39: 0x08320063 postgres`finish_heap_swap(OIDOldHeap=1262,
OIDNewHeap=16580, is_system_catalog=true, swap_toast_by_content=true,
check_constraints=false, is_internal=true, frozenXid=673,
cutoffMulti=1, newrelpersistence='p') at cluster.c:1673
frame #40: 0x0831f5a3
postgres`rebuild_relation(OldHeap=0x31c2ff68, indexOid=0,
verbose=false) at cluster.c:629
frame #41: 0x0831eecd postgres`cluster_rel(tableOid=1262,
indexOid=0, options=0) at cluster.c:435
frame #42: 0x083f7c1d postgres`vacuum_rel(relid=1262,
relation=0x28b4b9dc, params=0xffbfd670) at vacuum.c:1743
frame #43: 0x083f6f87 postgres`vacuum(relations=0x31d6c1cc,
params=0xffbfd670, bstrategy=0x31d6c090, isTopLevel=true) at
vacuum.c:372
frame #44: 0x083f6837 postgres`ExecVacuum(pstate=0x31ca2c90,
vacstmt=0x28b4ba54, isTopLevel=true) at vacuum.c:175
frame #45: 0x0869f145
postgres`standard_ProcessUtility(pstmt=0x28b4bb18, queryString="VACUUM
FULL pg_database;", context=PROCESS_UTILITY_TOPLEVEL,
params=0x00000000, queryEnv=0x00000000, dest=0x28b4bc90,
completionTag="") at utility.c:670
frame #46: 0x0869e68e postgres`ProcessUtility(pstmt=0x28b4bb18,
queryString="VACUUM FULL pg_database;",
context=PROCESS_UTILITY_TOPLEVEL, params=0x00000000,
queryEnv=0x00000000, dest=0x28b4bc90, completionTag="") at
utility.c:360
frame #47: 0x0869ddfb postgres`PortalRunUtility(portal=0x31bee090,
pstmt=0x28b4bb18, isTopLevel=true, setHoldSnapshot=false,
dest=0x28b4bc90, completionTag="") at pquery.c:1175
frame #48: 0x0869ce02 postgres`PortalRunMulti(portal=0x31bee090,
isTopLevel=true, setHoldSnapshot=false, dest=0x28b4bc90,
altdest=0x28b4bc90, completionTag="") at pquery.c:1321
frame #49: 0x0869c363 postgres`PortalRun(portal=0x31bee090,
count=2147483647, isTopLevel=true, run_once=true, dest=0x28b4bc90,
altdest=0x28b4bc90, completionTag="") at pquery.c:796
frame #50: 0x08696e68
postgres`exec_simple_query(query_string="VACUUM FULL pg_database;") at
postgres.c:1215
frame #51: 0x08695eec postgres`PostgresMain(argc=1,
argv=0x31be6658, dbname="contrib_regression", username="munro") at
postgres.c:4247
frame #52: 0x085aedb0 postgres`BackendRun(port=0x31be1000) at
postmaster.c:4399
frame #53: 0x085adf9a postgres`BackendStartup(port=0x31be1000) at
postmaster.c:4090
frame #54: 0x085accd5 postgres`ServerLoop at postmaster.c:1703
frame #55: 0x085a9d95 postgres`PostmasterMain(argc=8,
argv=0xffbfe608) at postmaster.c:1376
frame #56: 0x0849ec92 postgres`main(argc=8, argv=0xffbfe608) at main.c:228
frame #57: 0x080bf5eb postgres`_start1(cleanup=0x28b1e540, argc=8,
argv=0xffbfe608) at crt1_c.c:73
frame #58: 0x080bf4b8 postgres`_start at crt1_s.S:49

(lldb) print num_held_lwlocks
(int) $0 = 1
(lldb) print held_lwlocks[0]
(LWLockHandle) $1 = {
lock = 0x295365a4
mode = LW_EXCLUSIVE
}

--
Thomas Munro
https://enterprisedb.com
