Skip site navigation (1) Skip section navigation (2)

Re: right sibling is not next child

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Peter Brant" <Peter(dot)Brant(at)wicourts(dot)gov>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: right sibling is not next child
Date: 2006-04-12 20:28:41
Message-ID: 3355.1144873721@sss.pgh.pa.us (view raw or flat)
Thread:
Lists: pgsql-bugs
"Peter Brant" <Peter(dot)Brant(at)wicourts(dot)gov> writes:
> One thing that seems strange to me is that the original crash on
> Thursday failed on Panel_pkey, but my "vacuum analyze verbose" on a copy
> of the crashed database failed on MaintCode /
> pg_statistic_relid_att_index.

I can't find anything particularly wrong with
pg_statistic_relid_att_index.  There are two index entries for many of
the key values, but that's unsurprising if a database-wide vacuum
analyze had been done or in progress --- just-obsoleted pg_statistic
rows wouldn't have been vacuumed away yet.  I'm inclined to think that
this problem is a side-effect of the fact that you had to use
pg_resetxlog; there was probably an update to pg_statistic that got
lost.  If you want to continue experimenting with this database,
I'd suggest TRUNCATE'ing pg_statistic and rebuilding it via another
VACUUM ANALYZE run.

> I'll send over pg_statistic_relid_att_index and Panel_pkey.  Showing
> the keys to Panel_pkey is no problem.

My little index printing tool shows these entries in Panel_pkey at btree
level zero:

...
40	2006	RES032706 	CH0327RES 		high key on page 606
40	2006	RES032706 	RES032706 		383	11
40	2006	RES032706 	RES032706 		high key on page 608
40	2006	RES040306 	CC0403RES 		507	14
40	2006	RES040306 	CCC0403RES		551	1
40	2006	RES040306 	CCC0403RES		high key on page 601
40	2006	RES040306 	RES040306 		500	1
40	2006	RES040306 	RES040306 		high key on page 640
40	2006	RES040306 	RES040306 		high key on page 636
40	2006	RES040306 	RES040306 		high key on page 635
41	0001	2000POOL  	0001      		159	3
41	0001	2000POOL  	0002      		159	4
41	0001	2000POOL  	0003      		159	5
...

(The first four columns are the key values of this index; the last two
are the pointed-to heap tuple's page/line location.  High keys are
printed after any data keys on the page.)  The down-links at level one
look like:

...
40	2006	RES032706 	CCC0327RES		606	1
40	2006	RES032706 	CH0327RES 		608	1
40	2006	RES032706 	RES032706 		601	1
40	2006	RES040306 	CCC0403RES		640	1
40	2006	RES040306 	RES040306 		635	1
40	2006	RES040306 	RES040306 		636	1
40	2006	RES040306 	RES040306 		635	1
40	2006	RES040306 	RES040306 		629	1
41	0405	0405      	0105      		166	1
...

This is fairly interesting because we've got three pages with the same
boundary key.  The bogus entry for page 635 has been inserted where
you'd expect it to get inserted if the insertion were being done on the
basis of key comparison.  (We usually insert a new entry in front of any
ones with the same key.)  But we never do insertions on non-leaf pages
by key comparison!  Upper-level entries are only made during page
splits, by _bt_insert_parent, and that always works by locating the
down-link to the page just split and adding the new entry just after it.

One thing I realize after seeing this is that the index corruption might
be of long standing: the way that the btree search algorithms work, no
search would ever have descended directly to 635 or 636, but instead to
640 and then traverse right from there.  So you'd not have noticed any
malfunction, until the time came to try to delete the page.  That means
we shouldn't assume that the problem was created recently.

Now that we know the problem is associated with keys '40 2006 RES040306
RES040306', can you check your logs and see if anything interesting
happened around the time those keys were being inserted?  (Note: if
Panel_pkey really is a primary key, ie unique, it might seem odd for
there to be several pages worth of identical entries --- this would
imply that the same record had been updated several hundred times
between two vacuums.)

I still kinda like the theory that the extra key got in there because
btree_xlog_cleanup fired inappropriately ... mainly because I can't
think of any other theories ... but I'm still baffled about the details.
Anyone have any ideas?

			regards, tom lane

In response to

pgsql-bugs by date

Next:From: Bruce MomjianDate: 2006-04-12 21:46:39
Subject: Re: contrib/intarray/_int_gist.c
Previous:From: Peter BrantDate: 2006-04-12 16:40:27
Subject: Re: right sibling is not next child

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group