Re: Why we lost Uber as a user

From: Kevin Grittner <kgrittn(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Alfred Perlstein <alfred(at)freebsd(dot)org>, Geoff Winkless <pgsqladmin(at)geoff(dot)dj>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Why we lost Uber as a user
Date: 2016-08-03 20:51:25
Message-ID: CACjxUsMmXzurHeHgfW8PtLAy37zyGJXihX1c7gJxO+WFnNfZWQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Aug 3, 2016 at 2:15 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Joshua D. Drake" <jd(at)commandprompt(dot)com> writes:
>> On 08/03/2016 11:23 AM, Tom Lane wrote:
>>> I think the realistic answer if you suffer replication-induced corruption
>>> is usually going to be "re-clone that slave", and logical rep doesn't
>>> really offer much gain in that.
>
>> Yes, it actually does. The ability to unsubscribe a set of tables,
>> truncate them and then resubscribe them is vastly superior to having to
>> take a base backup.
>
> True, *if* you can circumscribe the corruption to a relatively small
> part of your database, logical rep might provide more support for a
> partial re-clone.

When I worked with Wisconsin Courts to migrate their databases to
PostgreSQL, we had a DBMS-agnostic logical replication system, and
we had a compare program that could be run off-hours or left as a
background activity for the replication software to work on during
idle time. Either way, a range of rows based on primary key was
read on each side and hashed, the hashes were compared, and if they
didn't match there was a column-by-column compare for each row in
the range, with differences listed. This is how we discovered
issues like non-standard backslash handling mangling our data.
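
To make the compare procedure above concrete, here is a minimal sketch in
Python. It is not the Wisconsin Courts program; everything in it is an
assumption for illustration: tables are modeled as in-memory dicts keyed by
an integer primary key, and the names range_hash, diff_range, and
compare_tables are hypothetical. A real tool would read each key range from
the two databases with queries instead.

```python
import hashlib
from typing import Any, Dict, List, Optional, Tuple

Row = Dict[str, Any]
Table = Dict[int, Row]  # hypothetical stand-in: primary key -> row
Diff = Tuple[int, str, Any, Any]  # (pk, column, source value, replica value)


def range_hash(table: Table, lo: int, hi: int) -> str:
    """Hash every row with lo <= pk < hi, visiting rows in key order."""
    h = hashlib.sha256()
    for pk in sorted(k for k in table if lo <= k < hi):
        # Sort columns so the hash does not depend on dict ordering.
        h.update(repr((pk, sorted(table[pk].items()))).encode("utf-8"))
    return h.hexdigest()


def diff_range(source: Table, replica: Table, lo: int, hi: int) -> List[Diff]:
    """Column-by-column compare of every row in a mismatched key range."""
    diffs: List[Diff] = []
    keys = sorted(k for k in set(source) | set(replica) if lo <= k < hi)
    for pk in keys:
        s: Optional[Row] = source.get(pk)
        r: Optional[Row] = replica.get(pk)
        if s is None or r is None:
            # Row exists on only one side; report the whole row.
            diffs.append((pk, "<row>", s, r))
            continue
        for col in sorted(set(s) | set(r)):
            # Simplification: a missing column is treated like None.
            if s.get(col) != r.get(col):
                diffs.append((pk, col, s.get(col), r.get(col)))
    return diffs


def compare_tables(source: Table, replica: Table, chunk: int = 1000) -> List[Diff]:
    """Compare hashes per key range; drill into rows only on a mismatch."""
    keys = set(source) | set(replica)
    if not keys:
        return []
    diffs: List[Diff] = []
    for lo in range(min(keys), max(keys) + 1, chunk):
        hi = lo + chunk
        if range_hash(source, lo, hi) != range_hash(replica, lo, hi):
            diffs.extend(diff_range(source, replica, lo, hi))
    return diffs
```

The point of hashing first is that matching ranges cost one comparison
each, so the expensive row-by-row work is spent only where the two sides
actually differ. Each reported difference carries the primary key and
column name, which is what makes a manual fix against source documents
practical.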

Personally, I can't imagine running logical replication of
supposedly matching sets of data without something equivalent.

Certainly, the courts had source documents to use for resolving any
question of the correct value on a mismatch, and I would imagine
that many environments would. If you have a meaningful primary key
(like a court case number, by which the file folder is physically
located), seeing the different values for a specific column in a
specific row makes fixes pretty straightforward.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
