Re: strange nbtree corruption report

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: strange nbtree corruption report
Date: 2011-11-22 04:14:33
Message-ID: 1731.1321935273@sss.pgh.pa.us
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> We got a very strange nbtree corruption report some time ago. This was
> a btree index on a very high churn table -- entries are updated and
> deleted very quickly, so the index grows very large and also shrinks
> quickly (AFAICT this is a work queue of sorts).

> The strangest thing of all is that there was this error:

> ERROR: left link changed unexpectedly in block 3378 of index "index_name"
> CONTEXT: automatic vacuum of table "table_name"

> This was reported not once, but several dozens of times, by each new
> autovacuum worker that tried to vacuum the table.

> As far as I can see, there is just no way for this to happen ... much
> less happen repeatedly.

It's not hard to believe that that would happen repeatedly given a
corrupted set of sibling links, eg deletable page A links left to page
B, which links right to C, which links right to A. The question is how
the index got into such a state. A dropped update during a page split
would explain it (ie, B used to be A's left sibling, then at some point
B got split into B and C, but A's left-link never got updated on disk).
I wonder how reliable their disk+filesystem is ...
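
To make the scenario concrete, here is a toy model of the sibling links
described above. It is only a sketch: the page names, the dict layout and
the helper find_inconsistent_left_links are made up for illustration and
are not PostgreSQL's on-disk format or the actual nbtree code. It checks
the invariant that a page's left sibling must link right back to it.

```python
# Toy model of btree sibling links. The names and the dict layout are
# illustrative assumptions, not PostgreSQL's page format.

def find_inconsistent_left_links(pages):
    """Return the pages whose left sibling does not link right back to them."""
    bad = []
    for name, links in pages.items():
        left = links["left"]
        if left is not None and pages[left]["right"] != name:
            bad.append(name)
    return bad

# State after B was split into B and C, with the update of A's left-link
# lost: A still points left to B, but B now points right to C.
pages = {
    "B": {"left": None, "right": "C"},
    "C": {"left": "B", "right": "A"},
    "A": {"left": "B", "right": None},  # stale; should be "C"
}

stale = find_inconsistent_left_links(pages)

# The same chain with A's left-link correctly updated passes the check.
repaired = {name: dict(links) for name, links in pages.items()}
repaired["A"]["left"] = "C"
clean = find_inconsistent_left_links(repaired)
```

Since nothing rewrites A's stale left-link, every pass over the same
pages trips over the same mismatch, which would fit each new autovacuum
worker reporting the error again.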

regards, tom lane
