Re: strange nbtree corruption report

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: strange nbtree corruption report
Date: 2011-11-28 21:44:05
Message-ID: 1322516294-sup-4592@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Excerpts from Tom Lane's message of mar nov 22 01:14:33 -0300 2011:
> Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> > We got a very strange nbtree corruption report some time ago. This was
> > a btree index on a vey high churn table -- entries are updated and
> > deleted very quickly, so the index grows very large and also shrinks
> > quickly (AFAICT this is a work queue of sorts).
>
> > The most strange thing of all is that there was this error:
>
> > ERROR: left link changed unexpectedly in block 3378 of index "index_name"
> > CONTEXT: automatic vacuum of table "table_name"
>
> > This was reported not once, but several dozens of times, by each new
> > autovacuum worker that tried to vacuum the table.
>
> > As far as I can see, there is just no way for this to happen ... much
> > less happen repeatedly.
>
> It's not hard to believe that that would happen repeatedly given a
> corrupted set of sibling links, eg deletable page A links left to page
> B, which links right to C, which links right to A. The question is how
> the index got into such a state. A dropped update during a page split
> would explain it (ie, B used to be A's left sibling, then at some point
> B got split into B and C, but A's left-link never got updated on disk).
> I wonder how reliable their disk+filesystem is ...

While summarizing this, a (relatively) frequent problem with unique
indexes came to my mind: there would be a UNIQUE index but when the
admin tries to rebuild it for whatever reason, duplicate values are
found. We've seen dozens of reports of this kind of problem (in the
pgsql lists I mean -- I don't think we've seen this problem in this
customer's servers). I wonder if it's related, because it seems pretty
much the same mechanism: sometimes, a btree index insert would be
randomly forgotten (its page write lost in vacuum, so to speak), and
thus when the second heap item comes along, there's no entry in the
index and the insert completes, and there you have your duplicate value.

I wonder if it would be worthwhile to build some sort of tool to scan
the heap and ensure that there are index entries for all heap items,
just to test the hypothesis. Not that this would enlighten on the
source of the actual problem.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2011-11-28 21:47:52 Re: [PATCH] Replace a long chain of if's in eval_const_expressions_mutator by a switch()
Previous Message Andrew Dunstan 2011-11-28 21:42:56 Re: pg_upgrade automatic testing