Skip site navigation (1) Skip section navigation (2)

Re: Index corruption

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marc Munro <marc(at)bloodnok(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Index corruption
Date: 2006-06-29 23:27:07
Message-ID: 18191.1151623627@sss.pgh.pa.us (view raw or flat)
Thread:
Lists: pgsql-hackers
I wrote:
> What I speculate right at the moment is that we are not looking at index
> corruption at all, but at heap corruption: somehow, the first insertion
> into ctid (27806,2) got lost and the same ctid got re-used for the next
> inserted row.  We fixed one bug like this before ...

Further study makes this look more probable.  It seems that most of the
activity around the trouble spot consists of transactions inserting
exactly 13 rows into the table; for instance here are traces of two
successive transactions:

Block	Item	XMIN		CMIN

27800	7	600921860	3
27800	8	600921860	7
27800	9	600921860	11
27800	10	600921860	15
27800	11	600921860	19
27800	12	600921860	23
27800	13	600921860	27
27800	14	600921860	31
27800	15	600921860	35
27800	16	600921860	39
27800	17	600921860	43
27800	18	600921860	47
27800	19	600921860	51

27800	20	600921870	3
27800	21	600921870	7
27800	22	600921870	11
27800	23	600921870	15
27800	24	600921870	19
27800	25	600921870	23
27800	26	600921870	27
27800	27	600921870	31
27800	28	600921870	35
27800	29	600921870	39
27800	30	600921870	43
27800	31	600921870	47
27800	32	600921870	51

The pattern of CMIN values is the same for all these transactions.
But look at the rows inserted by 600921856 and 600921858, the two
transactions that seem to be involved with the problem on page
27086:

27787	50	600921856	3
27795	41	600921858	3
27795	42	600921858	7
27795	43	600921858	11
27795	44	600921858	15
27795	45	600921858	19
27795	46	600921858	23
27795	47	600921858	27
27795	48	600921858	31
27795	49	600921858	35
27795	50	600921858	39
27795	51	600921858	43
27806	1	600921858	47
27806	2	600921856	15
27806	3	600921856	19
27806	4	600921856	23
27806	5	600921856	27
27806	6	600921856	31
27806	7	600921856	35
27806	8	600921856	39
27806	9	600921856	43
27806	10	600921856	47
27806	11	600921856	51
27806	12	600921858	51

(27787,50) and (27795,51) are both the last rows on their pages.  What's
evidently happened is that the two transactions filled those pages and
then both seized on 27806 as the next page to insert into.  I think that
600921856 tried to insert its CMIN 7 and 11 rows as items 1 and 2 on
that page, and then something wiped the page, then 600921858 inserted
its CMIN 47 row as item 1, and then 600921856 got control back and
finished inserting its rows.  Further study of the indexes shows that
there are two entries in each index pointing to each of (27806,1) and
(27806,2) --- but since the xxid values are different for the two
(27806,1) entries, those didn't show up as duplicates in my first look.

Given the apparent connection to vacuuming, this is looking a WHOLE
lot like this bug fixed in 8.0.3:

2005-05-07 17:32  tgl

	* src/backend/: access/heap/hio.c, commands/vacuumlazy.c
	(REL7_3_STABLE), access/heap/hio.c, access/nbtree/nbtpage.c,
	access/nbtree/nbtree.c, commands/vacuumlazy.c (REL7_4_STABLE),
	access/heap/hio.c, commands/vacuumlazy.c (REL7_2_STABLE),
	access/heap/hio.c, access/nbtree/nbtpage.c, access/nbtree/nbtree.c,
	commands/vacuumlazy.c (REL8_0_STABLE), access/heap/hio.c,
	access/nbtree/nbtpage.c, access/nbtree/nbtree.c,
	commands/vacuumlazy.c: Repair very-low-probability race condition
	between relation extension and VACUUM: in the interval between
	adding a new page to the relation and formatting it, it was
	possible for VACUUM to come along and decide it should format the
	page too.  Though not harmful in itself, this would cause data loss
	if a third transaction were able to insert tuples into the vacuumed
	page before the original extender got control back.

Are you *certain* this slave isn't running 8.0.2 or older?  If you can
verify that, then I guess we need to look for another mechanism that
could cause the same kind of thing.

			regards, tom lane

In response to

Responses

pgsql-hackers by date

Next:From: Tom LaneDate: 2006-06-29 23:34:55
Subject: Re: [HACKERS] Non-transactional pg_class, try 2
Previous:From: Marc MunroDate: 2006-06-29 22:06:16
Subject: Re: Index corruption

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group