Re: Corruption with duplicate primary key

From: Alex Adriaanse <alex(at)oseberg(dot)io>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Corruption with duplicate primary key
Date: 2019-12-11 23:42:45
Message-ID: SN6PR03MB359873DE51E9CD69837E5117A95A0@SN6PR03MB3598.namprd03.prod.outlook.com
Lists: pgsql-hackers

On Thu, December 5, 2019 at 5:34 PM Peter Geoghegan wrote:
> > We have a Postgres 10 database that we recently upgraded to Postgres 12 using pg_upgrade. We recently discovered that there are rows in one of the tables that have duplicate primary keys:
>
> What's the timeline here? In other words, does it look like these rows
> were updated and/or deleted before, around the same time as, or after
> the upgrade?

The Postgres 12 upgrade was performed on 2019-11-22, so the affected rows were modified after the upgrade (although some of them were originally inserted before the upgrade and only modified/duplicated afterwards).

> > This database runs inside Docker, with the data directory bind-mounted to a reflink-enabled XFS filesystem. The VM is running Debian's 4.19.16-1~bpo9+1 kernel inside an AWS EC2 instance. We have Debezium stream data from this database via pgoutput.
>
> That seems suspicious, since reflink support for XFS is rather immature.

Good point. Looking at kernel commits since 4.19.16, it appears that later kernel versions include several bug fixes for XFS corruption issues. Regardless of whether filesystem bugs are responsible for this corruption, I plan on upgrading to a newer kernel.

> How did you invoke pg_upgrade? Did you use the --link (hard link) option?

Yes, we first created a backup using "cp -a --reflink=always", ran initdb on the new directory, and then upgraded using "pg_upgrade -b ... -B ... -d ... -D -k".
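
A minimal sketch of that sequence, for readers following along. All paths are hypothetical placeholders (the real ones are elided above), and locating initdb in the new binary directory is an assumption. The script only builds and prints the commands; it does not run anything:

```python
import shlex


def upgrade_commands(old_bin, new_bin, old_data, new_data, backup):
    """Return the three steps described above as argv lists (nothing is executed)."""
    return [
        # 1. Reflink copy of the old data directory as a backup.
        ["cp", "-a", "--reflink=always", old_data, backup],
        # 2. Initialize the new cluster (assumes initdb lives in the new bindir).
        [f"{new_bin}/initdb", "-D", new_data],
        # 3. Upgrade in link mode: -k makes pg_upgrade hard-link files
        #    into the new cluster instead of copying them.
        [f"{new_bin}/pg_upgrade",
         "-b", old_bin, "-B", new_bin,
         "-d", old_data, "-D", new_data,
         "-k"],
    ]


if __name__ == "__main__":
    # Hypothetical layout, for illustration only.
    for cmd in upgrade_commands(
        old_bin="/usr/lib/postgresql/10/bin",
        new_bin="/usr/lib/postgresql/12/bin",
        old_data="/srv/pg/10/data",
        new_data="/srv/pg/12/data",
        backup="/srv/pg/10/data.bak",
    ):
        print(shlex.join(cmd))
```

Note that with -k the old and new clusters share the same underlying files after the upgrade, so the reflink copy is the only independent copy of the pre-upgrade data.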

Alex
