Re: Missing rows after migrating from postgres 11 to 12 with logical replication

From: Lars Vonk <lars(dot)vonk(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Missing rows after migrating from postgres 11 to 12 with logical replication
Date: 2020-12-20 16:33:24
Message-ID: CAMX1Thi3iHzLbza68nxfpiHs2Rw-xN-fXfbRU3FXg3OBXUnZyQ@mail.gmail.com
Lists: pgsql-general

Hi,

Just wondering if someone knows how this could have happened? Did we miss
something when setting up the logical replication? Are there any
scenarios in which this could happen (like a database restart or anything
else)?
Or should I report this as a bug (although I can't imagine it is)?
We would really like to know how we can prevent this from happening the
next time.

We still have the old primary, and a snapshot of the current primary from
around the time we flipped from the old to the new. So we could do some
digging into the cause, but we don't know what to look for...

Any help or tips are appreciated.

Thanks in advance,

Lars

On Fri, Dec 18, 2020 at 4:42 PM Lars Vonk <lars(dot)vonk(at)gmail(dot)com> wrote:

> Hi,
>
> We migrated from postgres 11 to 12 using logical replication (over local
> network). Today we noticed that one table is missing 1252 rows after the
> replication finished and we flipped to the new primary (we still have the
> old master database so we can recover).
>
> We see that these rows were inserted into the table after the initial
> copy of the table started. Most of the missing rows (1230) seem to come
> from inserts happening **during the initial copy**, and the rest (22) from
> inserts **during the period the replication ran** (7 days).
>
> After further investigation, unfortunately more tables have missing rows,
> all of them from after the initial table copy phase. We took a per-table
> approach for the replication, starting with creating an empty publication
> and adding tables via
>
> ALTER PUBLICATION pg12_migration ADD TABLE FOO
>
> After that we refreshed the publication on the "new postgres 12 primary"
> using
>
> ALTER SUBSCRIPTION pg12_migration REFRESH PUBLICATION;
>
> We only added a new table after the initial copy of the previous one was
> done (the internal state was replicating).
>
> We never stopped the subscriptions during all this and we started with a
> fresh schema.
>
> We did some sanity checks before we switched to the new master, like
> comparing max(id) to see if the replica was up to date (including this
> table) and counts on some smaller tables, and that all checked out okay. We
> never thought of rows missing somewhere in between...
>
> So how can this happen?
>
> Lars
>
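
The sanity checks described in the quoted message (max(id) and row counts on a few small tables) cannot detect rows missing in the middle of a table. As a minimal sketch of a stricter check, assuming the table has an integer key named id (as the max(id) comparison suggests) and that the id lists have already been fetched from each server, for example with a plain SELECT over an id range, the comparison could look like the code below. The function name and the sample data are hypothetical and only serve the illustration.

```python
from typing import Iterable, List


def missing_ids(source_ids: Iterable[int], replica_ids: Iterable[int]) -> List[int]:
    """Return the ids present on the source but absent on the replica, sorted."""
    return sorted(set(source_ids) - set(replica_ids))


# Hypothetical data: both sides have the same max(id), so a max(id)
# comparison passes even though two rows in between are missing.
source = [1, 2, 3, 4, 5, 6, 7, 8]
replica = [1, 2, 3, 5, 6, 8]

assert max(source) == max(replica)  # the max(id) check alone looks fine

gaps = missing_ids(source, replica)
print(gaps)
```

For large tables the same comparison would probably be run per id range rather than over the whole table at once, so that each set fits comfortably in memory.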
