Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?

From: Greg Stark <stark(at)mit(dot)edu>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>
Subject: Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?
Date: 2014-02-20 13:25:35
Message-ID: CAM-w4HPTOeMT4KP0OJK+mGgzgcTOtLRTvFZyvD0O4aH-7dxo3Q@mail.gmail.com
Lists: pgsql-hackers

I have a database where a couple of rows don't appear in index scans
but do appear in sequential scans. It looks like the same problem
Peter reported, but this is a different database. I've extracted all
the xlogdump records; the ones I think are relevant are below. You
can see that lp 2 gets a few HOT updates, and that someone concurrently
creates a MultiXact NO KEY UPDATE lock while one of those HOT updates
is pending but not yet committed. The net result seems to be that the
ctid update chain got broken. The index, of course, points to the head
of the HOT chain, so an index scan never finds the live tail, whereas
the sequential scan picks it up.
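
To make the broken chain concrete, here is a minimal sketch that follows the t_ctid links using the values copied from the page dump at the end of this message. It is a simplification: the real HOT search also checks the HEAP_HOT_UPDATED flag and matches xmin against the previous xmax, none of which is modelled here. The names CTID and follow_chain are made up for illustration.

```python
# lp -> offset part of t_ctid, copied from the page dump in this message.
# All four tuples are on block 13065, so only the offset is tracked.
CTID = {2: 2, 3: 4, 4: 7, 7: 7}


def follow_chain(ctid, start):
    """Follow t_ctid links from `start` until a tuple points at itself."""
    path = [start]
    while ctid[path[-1]] != path[-1]:
        nxt = ctid[path[-1]]
        if nxt in path:  # defensive guard against a cyclic chain
            break
        path.append(nxt)
    return path


# The index entry points at lp 2, the root of the chain.
print(follow_chain(CTID, 2))  # [2]
# Starting one step later shows the part of the chain that is cut off.
print(follow_chain(CTID, 3))  # [3, 4, 7]
```

Starting from lp 2 the walk stops immediately, so the live tuple at lp 7 is only reachable by something that visits every line pointer, which is what the sequential scan does.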

I don't see any evidence of MultiXactId wraparound: the members run to
001F and the offsets run to 000B. This is on a standby that has since
been activated, but AFAIK that shouldn't change these files any more,
right?
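
As a rough cross-check of those file names, the sketch below maps the mxid and member offset from the "create mxid" record in this message onto SLRU segment names. The constants are assumptions based on a default 8 kB block size and my reading of the 9.3 multixact layout (32 pages per segment, one 4-byte offset per MultiXactId, members stored in groups of four xids plus four flag bytes); verify them against the source before relying on the result.

```python
BLCKSZ = 8192                 # assumed default block size
PAGES_PER_SEGMENT = 32        # assumed SLRU pages per segment file

# Offsets SLRU: one 4-byte offset per MultiXactId.
OFFSETS_PER_PAGE = BLCKSZ // 4

# Members SLRU (assumed 9.3 layout): groups of 4 flag bytes + 4 four-byte xids.
MEMBERS_PER_GROUP = 4
GROUP_SIZE = 4 + 4 * MEMBERS_PER_GROUP
MEMBERS_PER_PAGE = (BLCKSZ // GROUP_SIZE) * MEMBERS_PER_GROUP


def offsets_segment(mxid):
    """Name of the offsets segment file that holds this MultiXactId."""
    return "%04X" % (mxid // (OFFSETS_PER_PAGE * PAGES_PER_SEGMENT))


def members_segment(offset):
    """Name of the members segment file that holds this member offset."""
    return "%04X" % (offset // (MEMBERS_PER_PAGE * PAGES_PER_SEGMENT))


# Values from the "create mxid 728896 offset 1632045" record.
print(offsets_segment(728896))   # 000B
print(members_segment(1632045))  # 001F
```

Under those assumptions the newest multixact lands in exactly the last segment of each directory, which is consistent with no wraparound having happened.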

rmgr: Heap len (rec/tot): 235/ 267, tx: 5943845, lsn: FD/2F0A3640, prev FD/2F0A3600, bkp: 0000, desc: insert: rel 1663/16385/212653; tid 13065/2
rmgr: Transaction len (rec/tot): 12/ 44, tx: 5943845, lsn: FD/2F0A8178, prev FD/2F0A8148, bkp: 0000, desc: commit: 2014-02-19 20:41:23.698513 UTC
rmgr: Heap len (rec/tot): 25/ 57, tx: 5943847, lsn: FD/2F0AA440, prev FD/2F0AA3F8, bkp: 0000, desc: lock 5943847: rel 1663/16385/212653; tid 13065/2 LOCK_ONLY KEYSHR_LOCK
rmgr: Transaction len (rec/tot): 12/ 44, tx: 5943847, lsn: FD/2F0AA480, prev FD/2F0AA440, bkp: 0000, desc: commit: 2014-02-19 20:41:23.713969 UTC
rmgr: Heap len (rec/tot): 291/ 323, tx: 5943849, lsn: FD/2F0ADFC0, prev FD/2F0ADF90, bkp: 0000, desc: hot_update: rel 1663/16385/212653; tid 13065/2 xmax 5943849 ; new tid 13065/3 xmax 0
rmgr: Heap2 len (rec/tot): 25/ 57, tx: 5943851, lsn: FD/2F0AE450, prev FD/2F0AE408, bkp: 0000, desc: lock updated: xmax 5943851 msk 000a; rel 1663/16385/212653; tid 13065/3
rmgr: MultiXact len (rec/tot): 28/ 60, tx: 5943851, lsn: FD/2F0AE490, prev FD/2F0AE450, bkp: 0000, desc: create mxid 728896 offset 1632045 nmembers 2: 5943849 (nokeyupd) 5943851 (keysh)
rmgr: Heap len (rec/tot): 25/ 57, tx: 5943851, lsn: FD/2F0AE4D0, prev FD/2F0AE490, bkp: 0000, desc: lock 728896: rel 1663/16385/212653; tid 13065/2 IS_MULTI EXCL_LOCK
rmgr: Transaction len (rec/tot): 12/ 44, tx: 5943849, lsn: FD/2F0AE510, prev FD/2F0AE4D0, bkp: 0000, desc: commit: 2014-02-19 20:41:23.744989 UTC
rmgr: Transaction len (rec/tot): 12/ 44, tx: 5943851, lsn: FD/2F0AE570, prev FD/2F0AE540, bkp: 0000, desc: commit: 2014-02-19 20:41:23.746820 UTC
rmgr: Heap len (rec/tot): 306/ 338, tx: 5943879, lsn: FD/2F103788, prev FD/2F103758, bkp: 0000, desc: hot_update: rel 1663/16385/212653; tid 13065/3 xmax 5943879 ; new tid 13065/4 xmax 0
rmgr: Transaction len (rec/tot): 12/ 44, tx: 5943879, lsn: FD/2F1038E0, prev FD/2F103788, bkp: 0000, desc: commit: 2014-02-19 20:41:24.580827 UTC
rmgr: Heap len (rec/tot): 306/ 338, tx: 5943880, lsn: FD/2F103910, prev FD/2F1038E0, bkp: 0000, desc: hot_update: rel 1663/16385/212653; tid 13065/4 xmax 5943880 ; new tid 13065/7 xmax 0
rmgr: Transaction len (rec/tot): 12/ 44, tx: 5943880, lsn: FD/2F106070, prev FD/2F106030, bkp: 0000, desc: commit: 2014-02-19 20:41:24.617048 UTC

 lp | lp_off | lp_flags | lp_len | t_xmin  | t_xmax  | t_field3 |  t_ctid   | t_infomask2 | t_infomask | t_hoff
----+--------+----------+--------+---------+---------+----------+-----------+-------------+------------+--------
  2 |   3424 |        1 |    232 | 5943845 |  728896 |        0 | (13065,2) |          32 |       4419 |     32
  3 |   3152 |        1 |    272 | 5943849 | 5943879 |        0 | (13065,4) |       49184 |       9475 |     32
  4 |   2864 |        1 |    287 | 5943879 | 5943880 |        0 | (13065,7) |       49184 |       9475 |     32
  7 |   2576 |        1 |    287 | 5943880 |       0 |        0 | (13065,7) |       32800 |      10499 |     32
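
The infomask columns are easier to read once decoded. The following is a small sketch that turns the decimal values above into flag names. The bit assignments are my transcription of htup_details.h as of 9.3 and should be treated as an assumption to check against the source; the names INFOMASK, INFOMASK2 and decode are made up for this example.

```python
# t_infomask bits (assumed to match htup_details.h in 9.3).
INFOMASK = [
    (0x0001, "HEAP_HASNULL"),
    (0x0002, "HEAP_HASVARWIDTH"),
    (0x0004, "HEAP_HASEXTERNAL"),
    (0x0008, "HEAP_HASOID"),
    (0x0010, "HEAP_XMAX_KEYSHR_LOCK"),
    (0x0020, "HEAP_COMBOCID"),
    (0x0040, "HEAP_XMAX_EXCL_LOCK"),
    (0x0080, "HEAP_XMAX_LOCK_ONLY"),
    (0x0100, "HEAP_XMIN_COMMITTED"),
    (0x0200, "HEAP_XMIN_INVALID"),
    (0x0400, "HEAP_XMAX_COMMITTED"),
    (0x0800, "HEAP_XMAX_INVALID"),
    (0x1000, "HEAP_XMAX_IS_MULTI"),
    (0x2000, "HEAP_UPDATED"),
    (0x4000, "HEAP_MOVED_OFF"),
    (0x8000, "HEAP_MOVED_IN"),
]

# t_infomask2 flag bits; the low 11 bits hold the attribute count.
INFOMASK2 = [
    (0x2000, "HEAP_KEYS_UPDATED"),
    (0x4000, "HEAP_HOT_UPDATED"),
    (0x8000, "HEAP_ONLY_TUPLE"),
]
NATTS_MASK = 0x07FF


def decode(value, table):
    """Return the names of all flags in `table` that are set in `value`."""
    return [name for bit, name in table if value & bit]


# (lp, t_infomask2, t_infomask) copied from the page dump above.
for lp, mask2, mask in [(2, 32, 4419), (3, 49184, 9475),
                        (4, 49184, 9475), (7, 32800, 10499)]:
    print(lp, mask2 & NATTS_MASK, decode(mask2, INFOMASK2), decode(mask, INFOMASK))
```

If those bit values are right, lp 3 and lp 4 decode as HEAP_HOT_UPDATED plus HEAP_ONLY_TUPLE and lp 7 as HEAP_ONLY_TUPLE, as expected for chain members, while lp 2 has HEAP_XMAX_IS_MULTI and HEAP_XMAX_EXCL_LOCK set but no HEAP_HOT_UPDATED, which fits a root tuple that no longer leads anywhere.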

--
greg
