Re: Logical decoding of sequence advances, part II

From: Kevin Grittner <kgrittn(at)gmail(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Logical decoding of sequence advances, part II
Date: 2016-08-23 14:07:50
Message-ID: CACjxUsNxkS=pD-5twvSheqeF3U8nfKT73YP69nMdORUiBqWOCA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 23, 2016 at 7:40 AM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
> On 23 Aug 2016 20:10, "Kevin Grittner" <kgrittn(at)gmail(dot)com> wrote:
>>
>> On Mon, Aug 22, 2016 at 6:39 PM, Craig Ringer <craig(at)2ndquadrant(dot)com>
>> wrote:
>>
>>> Could you provide an example of a case where xacts replayed in
>>> commit order will produce incorrect results?
>>
>> https://wiki.postgresql.org/wiki/SSI#Deposit_Report
>>
>> ... where T3 is on the replication target.
>
> Right. But we don't attempt to replicate locking let alone SSI state. As I
> said this is expected. If T1, T2 and T3 run in the master in either READ
> COMMITTED or SERIALIZABLE we will correctly replay whatever got committed
> and leave the replica in the same state as the master.

Eventually. Between the commit of T3 and the commit of T2, a state
can be seen on the replica which would not have been allowed on the
source.

> It is row level replication so there is no simple way to detect this
> anomaly.

That is probably true, but there is a way to *prevent* it.

> We would have to send a lot of co-ordination data *in both
> directions*, right?

No. The source has all the information about both commit order and
read-write dependencies, and could produce a stream of transaction
IDs to specify the safe order for applying transactions to prevent
the anomaly from appearing on the replica. In this case the commit
order is T1->T3->T2, but the apparent order of execution (AOoE) is
T1->T2->T3. If the source communicated that to the replica, and
the replica held up application of any changes from T3 until T2 was
committed, there would be no chance to read incorrect results. It
would not matter if T2 and T3 were committed on the replica
simultaneously or in AOoE, as long as the work of T3 does not
appear before the work of T2.
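
To make the idea concrete, here is a rough sketch of the replica
side. It is only an illustration, not a proposal for the actual
decoding API: the HoldingApplier class, the must_follow argument and
the replay() helper are all made-up names, and I am assuming the
source tags each transaction in the commit-ordered stream with the
IDs of the transactions whose work must become visible first.

```python
from collections import defaultdict


class HoldingApplier:
    """Hypothetical replica-side applier: holds a transaction until
    every transaction it must follow has been applied."""

    def __init__(self):
        self.applied = []                   # order in which work became visible
        self._applied_set = set()
        self._held = {}                     # xid -> predecessors still missing
        self._waiters = defaultdict(list)   # predecessor -> held xids waiting on it

    def receive(self, xid, must_follow=()):
        """Called once per transaction, in source commit order."""
        missing = {p for p in must_follow if p not in self._applied_set}
        if missing:
            # Hold the transaction; nothing goes back to the source.
            self._held[xid] = missing
            for p in missing:
                self._waiters[p].append(xid)
        else:
            self._apply(xid)

    def _apply(self, xid):
        ready = [xid]
        while ready:
            cur = ready.pop(0)
            self.applied.append(cur)
            self._applied_set.add(cur)
            # Release anything that was waiting only on this transaction.
            for waiter in self._waiters.pop(cur, []):
                self._held[waiter].discard(cur)
                if not self._held[waiter]:
                    del self._held[waiter]
                    ready.append(waiter)


def replay(stream):
    """Feed (xid, must_follow) pairs in commit order; return apply order."""
    applier = HoldingApplier()
    for xid, must_follow in stream:
        applier.receive(xid, must_follow)
    return applier.applied


# Commit order on the source is T1, T3, T2; T3 must not appear before T2.
commit_stream = [("T1", ()), ("T3", ("T2",)), ("T2", ())]
apply_order = replay(commit_stream)
```

With that stream the replica applies T1 right away, holds T3, and
releases it as soon as T2 has been applied, so the work of T3 never
shows up before the work of T2.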

The replica does not need to send anything back to the source for
this to work.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
