Re: Duplicate values found when reindexing unique index

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Mason Hale" <masonhale(at)gmail(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Gregory Stark" <stark(at)enterprisedb(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: Duplicate values found when reindexing unique index
Date: 2007-12-31 16:53:54
Message-ID: 6800.1199120034@sss.pgh.pa.us
Lists: pgsql-bugs

"Mason Hale" <masonhale(at)gmail(dot)com> writes:
> On Dec 31, 2007 9:48 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Do you by any chance still have 000000010000042200000058 and
>> 000000010000042200000059 archived? If so it would be useful to
>> look at the first dozen lines of "od -x" dump of each of them.

> Yes, I do. Here's the output:

> [postgres(at)dev-db-2 wal_archive]$ od -x 000000010000042200000058 | head -n15
> 0000000 d05e 0002 0001 0000 0423 0000 0000 c100
> 0000020 f7df 472e e701 4728 0000 0100 2000 0000
> 0000040 a1db 81e6 0423 0000 0068 c000 0000 0000

Hmm, something wrong here. The data looks sane, but the page header
address ought to be in 0000000100000423000000C1 if I'm not mistaken.
And what's even odder is that the second page of the file has got

> 0020000 d05e 0000 0001 0000 0422 0000 2000 5800
> 0020020 53c2 48bc 0422 0000 1fbc 5800 ce6f 2edd
> 0020040 003a 0000 001e 0000 0b00 0000 067f 0000
> 0020060 41be 0000 1ff8 0015 0000 0186 0007 0000
> 0020100 4337 000a 000c 008f 0122 0000 db84 d429
> 0020120 0422 0000 2010 5800 ce6f 2edd 003a 0000

which *is* what you'd expect to find at the second block of
000000010000042200000058.
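
For anyone following along, the header fields can be read off the "od -x" lines mechanically. The following is only a sketch, under these assumptions: a little-endian machine (so each od word is a 16-bit value stored low byte first), a page header that starts with a 16-bit magic, 16-bit info flags, a 32-bit timeline ID, and a 32-bit log id plus 32-bit byte offset, and 16MB segment files. The function names are made up for illustration.

```python
import struct

SEG_SIZE = 16 * 1024 * 1024  # assumed WAL segment size (16MB)

def decode_od_line(line):
    """Decode one 'od -x' line holding the first 16 bytes of a WAL page."""
    # Skip the offset column, keep the eight 16-bit words.
    words = [int(w, 16) for w in line.split()[1:9]]
    # Rebuild the raw bytes as they sit on disk (little-endian assumed).
    raw = struct.pack("<8H", *words)
    # magic, info, timeline, log id, byte offset within the log id
    return struct.unpack("<HHIII", raw)

def segment_name(tli, xlogid, xrecoff):
    """Name of the segment file that should contain this page address."""
    return "%08X%08X%08X" % (tli, xlogid, xrecoff // SEG_SIZE)

first = decode_od_line("0000000 d05e 0002 0001 0000 0423 0000 0000 c100")
second = decode_od_line("0020000 d05e 0000 0001 0000 0422 0000 2000 5800")
print(segment_name(*first[2:]))   # 0000000100000423000000C1
print(segment_name(*second[2:]))  # 000000010000042200000058
```

Under those assumptions the first page decodes to address 423/C1000000 and the second to 422/58002000, which is where the two file names above come from.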

And then we have

> [postgres(at)dev-db-2 wal_archive]$ od -x 000000010000042200000059 | head -n15
> 0000000 d05e 0001 0001 0000 006b 0000 6000 69dc
> 0000020 12ae 0000 6380 0024 0010 375a 21cd 1174
> 0000040 4001 0001 637c 0058 0010 375a 21cd 1174
> 0000060 4001 0001 6355 0010 0010 375a 21cd 1174
> 0000100 4001 0001 631d 005a 0010 375a 21cd 1174

which is just completely off in left field --- that's not even close to
being the right sequence number, plus it's not a valid
first-page-of-file header (which is what the xlog complaint message
was about). But on its own terms it might be valid data for someplace
in the middle of 000000010000006B00000069.

It might be worth trawling through both files to check the page headers
(every 8K) and see which ones agree with expectation and which don't.
The state of the ...0058 file might be explained by the theory that
you'd archived it a bit too late (after the first page had been
overwritten with newer WAL data), but the ...0059 file seems just plain
broken. I am starting to wonder about hardware or OS misfeasance
causing writes to be lost or misdirected.
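
A rough way to do that page-header check without reading od output by eye is sketched below. It is not a PostgreSQL tool, just an illustration that assumes little-endian data, 8K WAL pages, 16MB segments, the 16-byte header prefix seen in the dumps above (magic, info, timeline, log id, byte offset), and the magic value d05e that those dumps show. check_segment is a hypothetical helper name.

```python
import os
import struct

PAGE_SIZE = 8192              # WAL page size ("every 8K")
SEG_SIZE = 16 * 1024 * 1024   # assumed segment size (16MB)
XLOG_MAGIC = 0xD05E           # magic seen in the dumps above; release-specific

def check_segment(path):
    """List (page_offset, problem) for pages whose header disagrees with the file name."""
    name = os.path.basename(path)
    want_tli = int(name[0:8], 16)    # timeline from the file name
    want_log = int(name[8:16], 16)   # log id from the file name
    want_seg = int(name[16:24], 16)  # segment number from the file name
    bad = []
    with open(path, "rb") as f:
        offset = 0
        while True:
            page = f.read(PAGE_SIZE)
            if len(page) < 16:
                break
            magic, info, tli, xlogid, xrecoff = struct.unpack("<HHIII", page[:16])
            # Address this page should carry if it belongs in this file.
            want_off = want_seg * SEG_SIZE + offset
            if magic != XLOG_MAGIC:
                bad.append((offset, "bad magic %04X" % magic))
            elif (tli, xlogid, xrecoff) != (want_tli, want_log, want_off):
                bad.append((offset, "header says %X/%08X (tli %d)" % (xlogid, xrecoff, tli)))
            offset += PAGE_SIZE
    return bad

if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        for offset, problem in check_segment(path):
            print("%s @ %07o: %s" % (path, offset, problem))
```

The offsets are printed in octal so they line up with the left-hand column of "od -x". Pages that were never written (all zeroes) will show up as "bad magic 0000", which is expected at the tail of a partially filled segment and does not by itself indicate corruption.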

regards, tom lane
