Re: pg15b4: FailedAssertion("TransactionIdIsValid(xmax)

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Zhang Mingli <zmlpostgres(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: pg15b4: FailedAssertion("TransactionIdIsValid(xmax)
Date: 2022-09-12 01:42:35
Message-ID: 20220912014235.GC31833@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 12, 2022 at 10:44:38AM +1200, Thomas Munro wrote:
> On Sat, Sep 10, 2022 at 5:44 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> > < 2022-09-09 19:37:25.835 CDT telsasoft >ERROR: MultiXactId 133553154 has not been created yet -- apparent wraparound
>
> I guess what happened here is that after one of your (apparently
> several?) OOM crashes, crash recovery didn't run all the way to the
> true end of the WAL due to the maintenance_io_concurrency=0 bug. In
> the case you reported, it couldn't complete an end-of-recovery
> checkpoint until you disabled recovery_prefetch, but that's only
> because of the somewhat unusual way that vismap pages work. In
> another case it might have been able to (bogusly) complete a
> checkpoint, leaving things in an inconsistent state.

I think you're saying is that this can be explained by the
io_concurrency bug in recovery_prefetch, if run under 15b3.

But yesterday I started from initdb and restored this cluster from backup, and
started up sqlsmith, and sent some kill -9, and now got more corruption.
Looks like it took ~10 induced crashes before this happened.

At the moment, I have no reason to believe this issue is related to
prefetch_recovery; I am wondering about changes to vacuum.

< 2022-09-11 20:19:03.071 CDT telsasoft >ERROR: MultiXactId 732646 has not been created yet -- apparent wraparound
< 2022-09-11 20:24:00.530 CDT telsasoft >ERROR: MultiXactId 732646 has not been created yet -- apparent wraparound

Program terminated with signal 6, Aborted.
#0 0x00007f413716b1f7 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.el7_4.2.x86_64 libgcc-4.8.5-44.el7.x86_64 libxml2-2.9.1-6.el7_9.6.x86_64 lz4-1.8.3-1.el7.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) bt
#0 0x00007f413716b1f7 in raise () from /lib64/libc.so.6
#1 0x00007f413716c8e8 in abort () from /lib64/libc.so.6
#2 0x0000000000962c5c in ExceptionalCondition (conditionName=conditionName(at)entry=0x9ce238 "P_ISLEAF(opaque) && !P_ISDELETED(opaque)", errorType=errorType(at)entry=0x9bad97 "FailedAssertion",
fileName=fileName(at)entry=0x9cdcd1 "nbtpage.c", lineNumber=lineNumber(at)entry=1778) at assert.c:69
#3 0x0000000000507e34 in _bt_rightsib_halfdeadflag (rel=rel(at)entry=0x7f4138a238a8, leafrightsib=leafrightsib(at)entry=53) at nbtpage.c:1778
#4 0x0000000000507fba in _bt_mark_page_halfdead (rel=rel(at)entry=0x7f4138a238a8, leafbuf=leafbuf(at)entry=13637, stack=stack(at)entry=0x144ca20) at nbtpage.c:2121
#5 0x000000000050af1d in _bt_pagedel (rel=rel(at)entry=0x7f4138a238a8, leafbuf=leafbuf(at)entry=13637, vstate=vstate(at)entry=0x7ffef18de8b0) at nbtpage.c:2004
#6 0x000000000050c996 in btvacuumpage (vstate=vstate(at)entry=0x7ffef18de8b0, scanblkno=scanblkno(at)entry=36) at nbtree.c:1342
#7 0x000000000050caf8 in btvacuumscan (info=info(at)entry=0x7ffef18deac0, stats=stats(at)entry=0x142fb70, callback=callback(at)entry=0x67e89b <vac_tid_reaped>, callback_state=callback_state(at)entry=0x1461220, cycleid=<optimized out>)
at nbtree.c:997
#8 0x000000000050cc2f in btbulkdelete (info=0x7ffef18deac0, stats=0x142fb70, callback=0x67e89b <vac_tid_reaped>, callback_state=0x1461220) at nbtree.c:801
#9 0x00000000004fc64b in index_bulk_delete (info=info(at)entry=0x7ffef18deac0, istat=istat(at)entry=0x0, callback=callback(at)entry=0x67e89b <vac_tid_reaped>, callback_state=callback_state(at)entry=0x1461220) at indexam.c:701
#10 0x000000000068108c in vac_bulkdel_one_index (ivinfo=ivinfo(at)entry=0x7ffef18deac0, istat=istat(at)entry=0x0, dead_items=0x1461220) at vacuum.c:2324
#11 0x00000000004f72ae in lazy_vacuum_one_index (indrel=<optimized out>, istat=0x0, reltuples=<optimized out>, vacrel=vacrel(at)entry=0x142f100) at vacuumlazy.c:2726
#12 0x00000000004f738b in lazy_vacuum_all_indexes (vacrel=vacrel(at)entry=0x142f100) at vacuumlazy.c:2328
#13 0x00000000004f75df in lazy_vacuum (vacrel=vacrel(at)entry=0x142f100) at vacuumlazy.c:2261
#14 0x00000000004f7f14 in lazy_scan_heap (vacrel=vacrel(at)entry=0x142f100) at vacuumlazy.c:1264
#15 0x00000000004f895f in heap_vacuum_rel (rel=0x7f4138a67c00, params=0x143cbec, bstrategy=0x143ea20) at vacuumlazy.c:534
#16 0x000000000067f62b in table_relation_vacuum (bstrategy=<optimized out>, params=0x143cbec, rel=0x7f4138a67c00) at ../../../src/include/access/tableam.h:1680
#17 vacuum_rel (relid=1249, relation=<optimized out>, params=params(at)entry=0x143cbec) at vacuum.c:2086
#18 0x000000000068065c in vacuum (relations=0x144a118, params=params(at)entry=0x143cbec, bstrategy=<optimized out>, bstrategy(at)entry=0x143ea20, isTopLevel=isTopLevel(at)entry=true) at vacuum.c:475
#19 0x0000000000796a0e in autovacuum_do_vac_analyze (tab=tab(at)entry=0x143cbe8, bstrategy=bstrategy(at)entry=0x143ea20) at autovacuum.c:3149
#20 0x00000000007987bf in do_autovacuum () at autovacuum.c:2472
#21 0x0000000000798e72 in AutoVacWorkerMain (argc=argc(at)entry=0, argv=argv(at)entry=0x0) at autovacuum.c:1715
#22 0x0000000000798eed in StartAutoVacWorker () at autovacuum.c:1493
#23 0x000000000079fe49 in StartAutovacuumWorker () at postmaster.c:5534
#24 0x00000000007a0c44 in sigusr1_handler (postgres_signal_arg=<optimized out>) at postmaster.c:5239
#25 <signal handler called>
#26 0x00007f4137225783 in __select_nocancel () from /lib64/libc.so.6
#27 0x00000000007a0fc5 in ServerLoop () at postmaster.c:1770
#28 0x00000000007a2361 in PostmasterMain (argc=argc(at)entry=3, argv=argv(at)entry=0x136e9d0) at postmaster.c:1478
#29 0x00000000006ed9c4 in main (argc=3, argv=0x136e9d0) at main.c:202

#2 0x0000000000962c5c in ExceptionalCondition (conditionName=conditionName(at)entry=0xaa6a32 "false", errorType=errorType(at)entry=0x9bad97 "FailedAssertion", fileName=fileName(at)entry=0x9c6207 "heapam.c",
lineNumber=lineNumber(at)entry=7803) at assert.c:69
#3 0x00000000004d9db9 in index_delete_sort_cmp (deltid2=0x7ffef18dcc90, deltid1=<optimized out>) at heapam.c:7803
#4 index_delete_sort (delstate=delstate(at)entry=0x7ffef18ddf10) at heapam.c:7844
#5 0x00000000004e9323 in heap_index_delete_tuples (rel=0x7f4138a672b8, delstate=0x7ffef18ddf10) at heapam.c:7502
#6 0x000000000050a512 in table_index_delete_tuples (delstate=0x7ffef18ddf10, rel=0x0) at ../../../../src/include/access/tableam.h:1329
#7 _bt_delitems_delete_check (rel=rel(at)entry=0x7f41389dbea0, buf=buf(at)entry=9183, heapRel=heapRel(at)entry=0x7f4138a672b8, delstate=delstate(at)entry=0x7ffef18ddf10) at nbtpage.c:1540
#8 0x00000000004fff30 in _bt_simpledel_pass (rel=rel(at)entry=0x7f41389dbea0, buffer=buffer(at)entry=9183, heapRel=heapRel(at)entry=0x7f4138a672b8, deletable=deletable(at)entry=0x7ffef18ddfb0, ndeletable=4, newitem=<optimized out>,
minoff=2, maxoff=215) at nbtinsert.c:2899
#9 0x0000000000500171 in _bt_delete_or_dedup_one_page (rel=rel(at)entry=0x7f41389dbea0, heapRel=heapRel(at)entry=0x7f4138a672b8, insertstate=insertstate(at)entry=0x7ffef18de3c0, simpleonly=simpleonly(at)entry=false,
checkingunique=checkingunique(at)entry=true, uniquedup=uniquedup(at)entry=false, indexUnchanged=indexUnchanged(at)entry=false) at nbtinsert.c:2710
#10 0x00000000005051ad in _bt_findinsertloc (rel=rel(at)entry=0x7f41389dbea0, insertstate=insertstate(at)entry=0x7ffef18de3c0, checkingunique=checkingunique(at)entry=true, indexUnchanged=indexUnchanged(at)entry=false,
stack=stack(at)entry=0x157dbc8, heapRel=heapRel(at)entry=0x7f4138a672b8) at nbtinsert.c:902
#11 0x00000000005055ad in _bt_doinsert (rel=rel(at)entry=0x7f41389dbea0, itup=itup(at)entry=0x157dcc8, checkUnique=checkUnique(at)entry=UNIQUE_CHECK_YES, indexUnchanged=indexUnchanged(at)entry=false, heapRel=heapRel(at)entry=0x7f4138a672b8)
at nbtinsert.c:256
#12 0x000000000050b16c in btinsert (rel=0x7f41389dbea0, values=<optimized out>, isnull=<optimized out>, ht_ctid=0x157d1bc, heapRel=0x7f4138a672b8, checkUnique=UNIQUE_CHECK_YES, indexUnchanged=false, indexInfo=0x1803df0)
at nbtree.c:200
#13 0x00000000004fb95d in index_insert (indexRelation=indexRelation(at)entry=0x7f41389dbea0, values=values(at)entry=0x7ffef18de520, isnull=isnull(at)entry=0x7ffef18de500, heap_t_ctid=heap_t_ctid(at)entry=0x157d1bc,
heapRelation=heapRelation(at)entry=0x7f4138a672b8, checkUnique=UNIQUE_CHECK_YES, indexUnchanged=indexUnchanged(at)entry=false, indexInfo=indexInfo(at)entry=0x1803df0) at indexam.c:193
#14 0x0000000000581ae6 in CatalogIndexInsert (indstate=indstate(at)entry=0x157c2f8, heapTuple=heapTuple(at)entry=0x157d1b8) at indexing.c:158
#15 0x0000000000581b9f in CatalogTupleInsert (heapRel=heapRel(at)entry=0x7f4138a672b8, tup=tup(at)entry=0x157d1b8) at indexing.c:231
#16 0x000000000057996f in InsertPgClassTuple (pg_class_desc=0x7f4138a672b8, new_rel_desc=new_rel_desc(at)entry=0x7f41389d9e30, new_rel_oid=<optimized out>, relacl=relacl(at)entry=0, reloptions=reloptions(at)entry=0) at heap.c:939
#17 0x0000000000579a07 in AddNewRelationTuple (pg_class_desc=pg_class_desc(at)entry=0x7f4138a672b8, new_rel_desc=new_rel_desc(at)entry=0x7f41389d9e30, new_rel_oid=new_rel_oid(at)entry=500038, new_type_oid=new_type_oid(at)entry=0,
reloftype=reloftype(at)entry=0, relowner=relowner(at)entry=16556, relkind=relkind(at)entry=116 't', relfrozenxid=17414307, relminmxid=730642, relacl=relacl(at)entry=0, reloptions=reloptions(at)entry=0) at heap.c:998
#18 0x000000000057a204 in heap_create_with_catalog (relname=relname(at)entry=0x7ffef18dea90 "pg_toast_500035", relnamespace=relnamespace(at)entry=20138, reltablespace=<optimized out>, relid=500038, relid(at)entry=0,
reltypeid=reltypeid(at)entry=0, reloftypeid=reloftypeid(at)entry=0, ownerid=16556, accessmtd=2, tupdesc=tupdesc(at)entry=0x1800420, cooked_constraints=cooked_constraints(at)entry=0x0, relkind=relkind(at)entry=116 't',
relpersistence=relpersistence(at)entry=116 't', shared_relation=shared_relation(at)entry=false, mapped_relation=mapped_relation(at)entry=false, oncommit=oncommit(at)entry=ONCOMMIT_NOOP, reloptions=reloptions(at)entry=0,
use_user_acl=use_user_acl(at)entry=false, allow_system_table_mods=allow_system_table_mods(at)entry=true, is_internal=is_internal(at)entry=true, relrewrite=relrewrite(at)entry=0, typaddress=typaddress(at)entry=0x0) at heap.c:1386
#19 0x00000000005a41e2 in create_toast_table (rel=rel(at)entry=0x7f41389ddfb8, toastOid=toastOid(at)entry=0, toastIndexOid=toastIndexOid(at)entry=0, reloptions=reloptions(at)entry=0, lockmode=lockmode(at)entry=8, check=check(at)entry=false,
OIDOldToast=OIDOldToast(at)entry=0) at toasting.c:249
#20 0x00000000005a4571 in CheckAndCreateToastTable (relOid=relOid(at)entry=500035, reloptions=reloptions(at)entry=0, lockmode=lockmode(at)entry=8, check=check(at)entry=false, OIDOldToast=OIDOldToast(at)entry=0) at toasting.c:88
#21 0x00000000005a45d3 in NewRelationCreateToastTable (relOid=relOid(at)entry=500035, reloptions=reloptions(at)entry=0) at toasting.c:75
#22 0x0000000000609e47 in create_ctas_internal (attrList=attrList(at)entry=0x17ff798, into=into(at)entry=0x1374c80) at createas.c:135
#23 0x000000000060a0cf in intorel_startup (self=0x1547678, operation=<optimized out>, typeinfo=0x17fc530) at createas.c:528
#24 0x0000000000694b1e in standard_ExecutorRun (queryDesc=queryDesc(at)entry=0x1569188, direction=direction(at)entry=ForwardScanDirection, count=count(at)entry=0, execute_once=execute_once(at)entry=true) at execMain.c:352
#25 0x00007f41307d2a2e in pgss_ExecutorRun (queryDesc=0x1569188, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at pg_stat_statements.c:1010
#26 0x00007f41303af648 in explain_ExecutorRun (queryDesc=0x1569188, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at auto_explain.c:320
#27 0x0000000000694c13 in ExecutorRun (queryDesc=queryDesc(at)entry=0x1569188, direction=direction(at)entry=ForwardScanDirection, count=count(at)entry=0, execute_once=execute_once(at)entry=true) at execMain.c:305
#28 0x000000000060a894 in ExecCreateTableAs (pstate=pstate(at)entry=0x14bd950, stmt=stmt(at)entry=0x1545140, params=params(at)entry=0x0, queryEnv=queryEnv(at)entry=0x0, qc=qc(at)entry=0x7ffef18df720) at createas.c:336
#29 0x00000000008378dc in ProcessUtilitySlow (pstate=pstate(at)entry=0x14bd950, pstmt=pstmt(at)entry=0x15ce250,
queryString=queryString(at)entry=0x1373df0 "\n-- do paging substitutions\nCREATE TEMPORARY TABLE SU AS\n\tSELECT ...

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2022-09-12 02:10:47 Re: pg15b4: FailedAssertion("TransactionIdIsValid(xmax)
Previous Message Justin Pryzby 2022-09-12 00:54:43 Re: Background writer and checkpointer in crash recovery