Re: pg14b2: FailedAssertion("_bt_posting_valid(nposting)", File: "nbtdedup.c", ...

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pg14b2: FailedAssertion("_bt_posting_valid(nposting)", File: "nbtdedup.c", ...
Date: 2021-06-27 22:34:56
Message-ID: 20210627223456.GA21326@telsasoft.com
Lists: pgsql-hackers

On Sun, Jun 27, 2021 at 03:08:13PM -0700, Peter Geoghegan wrote:
> Can you please amcheck all of the indexes?

ts=# SELECT bt_index_check('child.alarms_null_alarm_clear_time_idx'::regclass);
ERROR: item order invariant violated for index "alarms_null_alarm_clear_time_idx"
DETAIL: Lower index tid=(1,77) (points to heap tid=(29,9)) higher index tid=(1,78) (points to heap tid=(29,9)) page lsn=80/4B9C69D0.
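
For checking the remaining indexes rather than just this one, a minimal sketch follows. It is adapted from the example in the amcheck documentation and is not what was run here. The catalog filters and the choice of heapallindexed => false are assumptions. Note that the query stops with an ERROR at the first index that fails a check.

```sql
-- Sketch: run bt_index_check() over every valid, non-temporary B-tree index.
-- Assumption: the amcheck extension is installed in the current database.
SELECT c.oid::regclass AS index_name,
       c.relpages,
       bt_index_check(index => c.oid::regclass, heapallindexed => false)
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_am am ON am.oid = c.relam
WHERE am.amname = 'btree'
  -- Skip temp relations, which may belong to another session.
  AND c.relpersistence <> 't'
  -- Only plain indexes that are ready and valid can be checked.
  AND c.relkind = 'i'
  AND i.indisready
  AND i.indisvalid
ORDER BY c.relpages;
```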

ts=# SELECT itemoffset, ctid ,itemlen, nulls, vars, dead, htid FROM bt_page_items('child.alarms_null_alarm_clear_time_idx', 1);
itemoffset | ctid | itemlen | nulls | vars | dead | htid
------------+-----------+---------+-------+------+------+---------
...
77 | (29,9) | 16 | t | f | f | (29,9)
78 | (29,9) | 16 | t | f | f | (29,9)
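
The repeated heap TID in items 77 and 78 can also be picked out without reading the whole page listing. This is a sketch only, assuming the htid column of bt_page_items (available since PostgreSQL 13). An equal htid on neighbouring tuples is only suspicious when the key values are equal as well, which this query does not check.

```sql
-- Sketch: list index tuples on block 1 whose heap TID equals that of the
-- preceding tuple (by item offset).
SELECT itemoffset, htid, prev_htid
FROM (
    SELECT itemoffset,
           htid,
           lag(htid) OVER (ORDER BY itemoffset) AS prev_htid
    FROM bt_page_items('child.alarms_null_alarm_clear_time_idx', 1)
) AS items
WHERE htid = prev_htid;
```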

ts=# SELECT lp, lp_off, lp_flags, lp_len, t_xmin, t_xmax, t_field3, t_ctid, t_infomask2, t_infomask, t_hoff, t_bits, t_oid FROM heap_page_items(get_raw_page('child.alarms_null', 29));
lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid
----+--------+----------+--------+--------+----------+----------+---------+-------------+------------+--------+------------------------------------------+-------
1 | 6680 | 1 | 1512 | 88669 | 27455486 | 44 | (29,1) | 8225 | 10691 | 32 | 1100001111111111111111101111111110000000 |
2 | 6 | 2 | 0 | | | | | | | | |
3 | 5168 | 1 | 1512 | 87374 | 27455479 | 37 | (29,3) | 8225 | 10691 | 32 | 1100001111111111111111101111111110000000 |
4 | 4192 | 1 | 976 | 148104 | 27574887 | 0 | (29,4) | 8225 | 10695 | 32 | 1100001111111111111111101111111110000000 |
5 | 10 | 2 | 0 | | | | | | | | |
6 | 3216 | 1 | 976 | 148137 | 27574888 | 0 | (29,6) | 40993 | 10695 | 32 | 1100001111111111111111101111111110000000 |
7 | 8 | 2 | 0 | | | | | | | | |
8 | 2240 | 1 | 976 | 47388 | 27574858 | 7 | (29,8) | 40993 | 10695 | 32 | 1100001111111111111111101111111110000000 |
9 | 0 | 3 | 0 | | | | | | | | |
10 | 1264 | 1 | 976 | 148935 | 27574889 | 0 | (29,10) | 40993 | 10695 | 32 | 1100001111111111111111101111111110000000 |
11 | 0 | 3 | 0 | | | | | | | | |
12 | 0 | 3 | 0 | | | | | | | | |
(12 rows)

(gdb) fr 4
#4 0x0000000000509a14 in _bt_insertonpg (rel=rel@entry=0x7f6dfd3cd628, itup_key=itup_key@entry=0x2011b40, buf=15, cbuf=cbuf@entry=0, stack=stack@entry=0x2011bd8, itup=0x2011c00, itup@entry=0x200d608, itemsz=16,
newitemoff=2, postingoff=62, split_only_page=split_only_page@entry=false) at nbtinsert.c:1174
1174 in nbtinsert.c
(gdb) p page
$5 = 0x7f6de58e0e00 "\200"
(gdb) dump binary memory /tmp/dump_block.page page (page + 8192)

ts=# SELECT lp, lp_off, lp_flags, lp_len, t_xmin, t_xmax, t_field3, t_ctid, t_infomask2, t_infomask, t_hoff, t_bits, t_oid FROM heap_page_items(pg_read_binary_file('/tmp/dump_block.page')) WHERE t_xmin IS NOT NULL;
lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid
-----+--------+----------+--------+---------+------------+----------+--------+-------------+------------+--------+--------+-------
1 | 8152 | 1 | 24 | 1048576 | 2685931521 | 0 | (0,0) | 0 | 120 | 1 | |
2 | 7288 | 1 | 864 | 1048576 | 2740985997 | 0 | (0,0) | 0 | 1 | 0 | |
67 | 6368 | 1 | 920 | 1048576 | 2744656022 | 0 | (0,0) | 33 | 4 | 0 | |
137 | 5056 | 1 | 1312 | 1048576 | 2770346200 | 0 | (0,0) | 69 | 6 | 0 | |
142 | 4608 | 1 | 448 | 1048576 | 2713722952 | 0 | (0,0) | 107 | 4 | 0 | |
(5 rows)

I didn't change the kernel here, nor for the previous bug report. That was going
to be my "next step" until I found the stuck autovacuum. I mentioned it for
context, but it probably just confused things.

--
Justin
