Re: unique index violation after pg_upgrade to PG10

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Kenneth Marshall <ktm(at)rice(dot)edu>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: unique index violation after pg_upgrade to PG10
Date: 2017-10-24 20:11:44
Message-ID: 20171024201144.GR21735@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Oct 24, 2017 at 12:31:49PM -0700, Peter Geoghegan wrote:
> On Tue, Oct 24, 2017 at 11:48 AM, Kenneth Marshall <ktm(at)rice(dot)edu> wrote:
> >> Really ? pg_repack "found" and was victim to the duplicate keys, and rolled
> >> back its work. The CSV logs clearly show that our application INSERTed rows
> >> which are duplicates.
> >>
> >> [pryzbyj(at)database ~]$ rpm -qa pg_repack10
> >> pg_repack10-1.4.2-1.rhel6.x86_64
> >>
> >> Justin
> >
> > Hi Justin,
> >
> > I just dealt with a similar problem with pg_repack and a PostgreSQL 9.5 DB,
> > the exact same error. It seemed to caused by a tuple visibility issue that
> > allowed the "working" unique index to be built, even though a duplicate row
> > existed. Then the next pg_repack would fail with the error you got. In our
> > case I needed the locality of reference to keep the DB performance acceptable
> > and it was not a critical issue if there was a duplicate. We would remove the
> > duplicates if we had a failure. We never had a problem with the NO pg_repack
> > scenarios.
>
> A new, enhanced version of the corruption detection tool amcheck is
> now available, and has both apt + yum packages available:
>
> https://github.com/petergeoghegan/amcheck
>
> Unlike the version in Postgres 10, this enhanced version (V1.2) has
> "heapallindexed" verification, which is really what you want here. If
> you install it, and run it against the unique index in question (with
> "heapallindexed" verification), that could help. It might provide a
> more useful diagnostic message.
>
> This is very new, so do let me know how you get on if you try it out.

I started an instance connected to a copy of the LVM snapshot I saved:
[pryzbyj(at)database ~]$ sudo -u postgres /usr/pgsql-10/bin/postmaster -c port=5678 -D /mnt/10/data

[pryzbyj(at)database amcheck]$ psql --port 5678 ts -c "SELECT bt_index_check('sites_idx'::regclass::oid, heapallindexed=>True)"
ERROR: high key invariant violated for index "sites_idx"
DETAIL: Index tid=(1,37) points to heap tid=(0,97) page lsn=0/0.
[pryzbyj(at)database amcheck]$ psql --port 5678 ts -c "SELECT bt_index_parent_check('sites_idx'::regclass::oid, heapallindexed=>True)"
ERROR: high key invariant violated for index "sites_idx"
DETAIL: Index tid=(1,37) points to heap tid=(0,97) page lsn=0/0.

ts=# SELECT * FROM page_header(get_raw_page('sites_idx', 1));
lsn | 0/0
checksum | 0
flags | 0
lower | 872
upper | 1696
special | 8176
pagesize | 8192
version | 4
prune_xid | 0

ts=# SELECT * FROM page_header(get_raw_page('sites', 0));
lsn | 1FB/AC5A4908
checksum | 0
flags | 5
lower | 436
upper | 464
special | 8192
pagesize | 8192
version | 4
prune_xid | 0

ts=# SELECT * FROM bt_page_items(get_raw_page('sites_idx', 1));

itemoffset | 48
ctid | (1,37)
itemlen | 32
nulls | f
vars | t
data | 1b 43 52 43 4c 4d 54 2d 43 45 4d 53 30 0b 31 31 31 31 00 00 00 00 00 00

itemoffset | 37
ctid | (0,97)
itemlen | 24
nulls | f
vars | t
data | 1b 43 52 43 4c 4d 54 2d 43 45 4d 53 30 03 00 00

..which I gather just verifies that the index is corrupt, not sure if there's
anything else to do with it? Note, we've already removed the duplicate rows.

Justin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-10-24 20:19:17 Re: Domains and arrays and composites, oh my
Previous Message Peter Geoghegan 2017-10-24 19:31:49 Re: unique index violation after pg_upgrade to PG10