From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Peter Geoghegan <pg(at)bowt(dot)ie> |
Cc: | Kenneth Marshall <ktm(at)rice(dot)edu>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: unique index violation after pg_upgrade to PG10 |
Date: | 2017-10-24 20:11:44 |
Message-ID: | 20171024201144.GR21735@telsasoft.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Oct 24, 2017 at 12:31:49PM -0700, Peter Geoghegan wrote:
> On Tue, Oct 24, 2017 at 11:48 AM, Kenneth Marshall <ktm(at)rice(dot)edu> wrote:
> >> Really ? pg_repack "found" and was victim to the duplicate keys, and rolled
> >> back its work. The CSV logs clearly show that our application INSERTed rows
> >> which are duplicates.
> >>
> >> [pryzbyj(at)database ~]$ rpm -qa pg_repack10
> >> pg_repack10-1.4.2-1.rhel6.x86_64
> >>
> >> Justin
> >
> > Hi Justin,
> >
> > I just dealt with a similar problem with pg_repack and a PostgreSQL 9.5 DB,
> > the exact same error. It seemed to caused by a tuple visibility issue that
> > allowed the "working" unique index to be built, even though a duplicate row
> > existed. Then the next pg_repack would fail with the error you got. In our
> > case I needed the locality of reference to keep the DB performance acceptable
> > and it was not a critical issue if there was a duplicate. We would remove the
> > duplicates if we had a failure. We never had a problem with the NO pg_repack
> > scenarios.
>
> A new, enhanced version of the corruption detection tool amcheck is
> now available, and has both apt + yum packages available:
>
> https://github.com/petergeoghegan/amcheck
>
> Unlike the version in Postgres 10, this enhanced version (V1.2) has
> "heapallindexed" verification, which is really what you want here. If
> you install it, and run it against the unique index in question (with
> "heapallindexed" verification), that could help. It might provide a
> more useful diagnostic message.
>
> This is very new, so do let me know how you get on if you try it out.
I started an instance connected to a copy of the LVM snapshot I saved:
[pryzbyj(at)database ~]$ sudo -u postgres /usr/pgsql-10/bin/postmaster -c port=5678 -D /mnt/10/data
[pryzbyj(at)database amcheck]$ psql --port 5678 ts -c "SELECT bt_index_check('sites_idx'::regclass::oid, heapallindexed=>True)"
ERROR: high key invariant violated for index "sites_idx"
DETAIL: Index tid=(1,37) points to heap tid=(0,97) page lsn=0/0.
[pryzbyj(at)database amcheck]$ psql --port 5678 ts -c "SELECT bt_index_parent_check('sites_idx'::regclass::oid, heapallindexed=>True)"
ERROR: high key invariant violated for index "sites_idx"
DETAIL: Index tid=(1,37) points to heap tid=(0,97) page lsn=0/0.
ts=# SELECT * FROM page_header(get_raw_page('sites_idx', 1));
lsn | 0/0
checksum | 0
flags | 0
lower | 872
upper | 1696
special | 8176
pagesize | 8192
version | 4
prune_xid | 0
ts=# SELECT * FROM page_header(get_raw_page('sites', 0));
lsn | 1FB/AC5A4908
checksum | 0
flags | 5
lower | 436
upper | 464
special | 8192
pagesize | 8192
version | 4
prune_xid | 0
ts=# SELECT * FROM bt_page_items(get_raw_page('sites_idx', 1));
itemoffset | 48
ctid | (1,37)
itemlen | 32
nulls | f
vars | t
data | 1b 43 52 43 4c 4d 54 2d 43 45 4d 53 30 0b 31 31 31 31 00 00 00 00 00 00
itemoffset | 37
ctid | (0,97)
itemlen | 24
nulls | f
vars | t
data | 1b 43 52 43 4c 4d 54 2d 43 45 4d 53 30 03 00 00
..which I gather just verifies that the index is corrupt, not sure if there's
anything else to do with it? Note, we've already removed the duplicate rows.
Justin
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2017-10-24 20:19:17 | Re: Domains and arrays and composites, oh my |
Previous Message | Peter Geoghegan | 2017-10-24 19:31:49 | Re: unique index violation after pg_upgrade to PG10 |