From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: 9.4 checksum error in recovery with btree index |
Date: | 2014-05-18 03:30:11 |
Message-ID: | CAMkU=1wreyWpBipAwo=DYYxHEu5k2Z3WVLJJXd-45F0WtTxFgA@mail.gmail.com |
Lists: | pgsql-hackers |
On Saturday, May 17, 2014, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
wrote:
> On 05/17/2014 12:28 AM, Jeff Janes wrote:
>
>> More fun with my torn page injection test program on 9.4.
>>
>> 24171 2014-05-16 14:00:44.934 PDT:WARNING: 01000: page verification
>> failed, calculated checksum 21100 but expected 3356
>> 24171 2014-05-16 14:00:44.934 PDT:CONTEXT: xlog redo split_l: rel
>> 1663/16384/16405 left 35191, right 35652, next 34666, level 0, firstright
>> 192
>> 24171 2014-05-16 14:00:44.934 PDT:LOCATION: PageIsVerified,
>> bufpage.c:145
>> 24171 2014-05-16 14:00:44.934 PDT:FATAL: XX001: invalid page in block
>> 34666 of relation base/16384/16405
>> 24171 2014-05-16 14:00:44.934 PDT:CONTEXT: xlog redo split_l: rel
>> 1663/16384/16405 left 35191, right 35652, next 34666, level 0, firstright
>> 192
>> 24171 2014-05-16 14:00:44.934 PDT:LOCATION: ReadBuffer_common,
>> bufmgr.c:483
>>
>>
>> I've seen this twice now; both times the checksum failure was for the
>> block labelled "next" in the redo record. Is this another case where the
>> block needs to be reinitialized upon replay?
>>
>
> Hmm, it looks like I fumbled the numbering of the backup blocks in the
> b-tree split WAL record (in 9.4). I blame the comments; the comments where
> the record is generated number the backup blocks starting from 1, but
> XLR_BKP_BLOCK(x) and RestoreBackupBlock(...) used in replay number them
> starting from 0.
>
> Attached is a patch that I think fixes them. In addition to the
> rnext-reference, clearing the incomplete-split flag in the child page had
> a similar numbering mishap.
>
That seems to have fixed it.
Thanks,
Jeff
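
The numbering mismatch described in the quoted reply can be illustrated with a small toy model. This is only a sketch and not the PostgreSQL code: every name below (`toy_bkp_flag`, `toy_make_info`, `toy_has_image`, and so on) is hypothetical, and the flag layout is an assumption chosen for illustration. It shows how passing a 1-based "backup block N" number from a comment into a 0-based lookup makes replay consult the wrong slot, so it can miss a full-page image that the record actually carries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_BKP_BLOCKS 4

/* Hypothetical flag bit for backup-block slot `index` (0-based). */
static uint8_t toy_bkp_flag(int index)
{
    assert(index >= 0 && index < TOY_MAX_BKP_BLOCKS);
    return (uint8_t) (0x08 >> index);
}

/* Record side: build the info bits from the slots that hold a page image. */
static uint8_t toy_make_info(const bool has_image[TOY_MAX_BKP_BLOCKS])
{
    uint8_t info = 0;
    for (int i = 0; i < TOY_MAX_BKP_BLOCKS; i++)
    {
        if (has_image[i])
            info |= toy_bkp_flag(i);
    }
    return info;
}

/* Replay side: is there a page image in the given 0-based slot? */
static bool toy_has_image(uint8_t info, int index)
{
    return (info & toy_bkp_flag(index)) != 0;
}

/*
 * Bug pattern: the caller reads "backup block 2" in a comment (1-based)
 * and passes that number unchanged to a 0-based lookup.
 */
static bool toy_has_image_buggy(uint8_t info, int one_based_number)
{
    return toy_has_image(info, one_based_number);
}

/* Corrected lookup: convert the 1-based number to a 0-based slot first. */
static bool toy_has_image_fixed(uint8_t info, int one_based_number)
{
    return toy_has_image(info, one_based_number - 1);
}
```

With images present only in slots 0 and 1, asking for "backup block 2" through `toy_has_image_buggy` inspects slot 2 and reports no image, while `toy_has_image_fixed` inspects slot 1 and finds it. In this toy model, that is the kind of off-by-one that would leave replay reading a possibly torn on-disk page instead of restoring the logged image.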