Re: Why we lost Uber as a user

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alfred Perlstein <alfred(at)freebsd(dot)org>, Geoff Winkless <pgsqladmin(at)geoff(dot)dj>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Why we lost Uber as a user
Date: 2016-08-02 20:12:20
Message-ID: 20160802201220.GW4028@tamriel.snowman.net
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Tue, Aug 2, 2016 at 3:07 PM, Alfred Perlstein <alfred(at)freebsd(dot)org> wrote:
> > You are quite technical, my feeling is that you will understand it, however it will need to be a self learned lesson.
>
> I don't know what this is supposed to mean, but I think that Geoff's
> point is somewhat valid. No matter how you replicate data, there is
> always the possibility that you will replicate any corruption along
> with the data - or that your copy will be unfaithful to the original.

I believe what Geoff was specifically getting at is probably best
demonstrated with an example.

Consider a bug in the btree index code which will accept a value but not
store it correctly.

INSERT INTO mytable (indexed_column) VALUES (-1000000000);

/* oops, bug, this value gets stored in the wrong place in the btree */

We happily accept the record and insert it into the btree index, but
that insert is incorrect and results in the btree being corrupted
because some bug doesn't handle values of such large magnitude correctly.

In such a case, either approach to replication (replicating the query
statement, which the replica would then run through the same buggy btree
code, or replicating the changes to the btree page exactly) would
result in corruption on the replica.

The above represents a bug in *just* the btree side of things (the
physical replication did its job correctly, even though the result is a
corrupted index on the replica).
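
To make that concrete, here is a toy sketch in Python. It is not
PostgreSQL code: the sorted list standing in for a btree, the
BUG_THRESHOLD constant, and all function names are hypothetical, and it
assumes the replica runs the same buggy index code as the primary.

```python
import bisect

# Hypothetical bug: values below this threshold are stored in the wrong place.
BUG_THRESHOLD = -999_999_999

def buggy_index_insert(index, value):
    """Insert into a sorted list that stands in for a btree index."""
    if value < BUG_THRESHOLD:
        index.append(value)          # bug: appended instead of placed in order
    else:
        bisect.insort(index, value)  # correct path keeps the list sorted

def is_sorted(index):
    """A crude consistency check for the toy index."""
    return all(a <= b for a, b in zip(index, index[1:]))

def replicate_statement(replica_index, value):
    # Statement replication: the replica re-runs the insert with the same code.
    buggy_index_insert(replica_index, value)

def replicate_physical(primary_index):
    # Physical replication: the replica gets an exact copy of the primary's page.
    return list(primary_index)

primary = [1, 5, 9]
statement_replica = [1, 5, 9]

buggy_index_insert(primary, -1000000000)
replicate_statement(statement_replica, -1000000000)
physical_replica = replicate_physical(primary)

for name, idx in [("primary", primary),
                  ("statement replica", statement_replica),
                  ("physical replica", physical_replica)]:
    print(name, idx, "sorted" if is_sorted(idx) else "CORRUPT")
```

In this sketch all three copies end up out of order: the replication
step itself did nothing wrong in either case, it just faithfully carried
over the result of the index bug.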

With physical replication, there is the concern that a bug in *just* the
physical (WAL) side of things could cause corruption. That is, we
correctly accept and store the value on the primary, but the records
generated to send that data to the replica are incorrect and result in
an invalid state on the replica.

Of course, a bug in the physical side of things which caused corruption
would mean that *crash recovery* would also cause corruption. As I
understand it, that same concern exists for MySQL, so moving to logical
replication doesn't remove the need to worry about bugs in the crash
recovery side of things, assuming you depend on the database to come
back up in a consistent manner after a crash.
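
A toy sketch of the WAL-only case, again in Python and not PostgreSQL
code. The record format, the 16-bit truncation bug, and the function
names are all made up for illustration, and it assumes the primary had
not flushed the modified page since the last checkpoint, so recovery has
to rebuild it from the log.

```python
import bisect

def buggy_wal_record(value):
    # Hypothetical bug: the log record keeps only the low 16 bits of the value.
    return {"op": "insert", "value": value & 0xFFFF}

def replay(checkpoint, wal):
    """Rebuild state from a checkpoint image plus the log records after it."""
    state = sorted(checkpoint)
    for record in wal:
        bisect.insort(state, record["value"])
    return state

checkpoint = [1, 5, 9]            # last consistent on-disk image
primary_pages = list(checkpoint)  # primary's in-memory state
wal = []

value = 70000
bisect.insort(primary_pages, value)  # the primary stores the value correctly
wal.append(buggy_wal_record(value))  # but the log record for it is wrong

replica_pages = replay(checkpoint, wal)      # replica applies the bad record
recovered_primary = replay(checkpoint, wal)  # so does crash recovery

print("primary before crash:", primary_pages)
print("replica:             ", replica_pages)
print("primary after crash: ", recovered_primary)
```

The primary is correct while it stays up, but the replica and the
recovered primary both replay the same bad record and agree with each
other rather than with the original data.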

Thanks!

Stephen
