Re: Future In-Core Replication

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Future In-Core Replication
Date: 2012-04-30 19:38:56
Message-ID: CA+Tgmob-R6mKASJfDDnm6LZP8qTCiDpPNxSG-NRapq+iNvWeRg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Apr 30, 2012 at 2:33 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Mon, Apr 30, 2012 at 12:38 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>> For example, you said that "MM replication alone is not a solution for
>> large data or the general case".  Why is that?  Is the goal of your work
>> really to do logical replication, which allows for major version
>> upgrades?  Is that the defining feature?
>
> TBH, I don't think MM replication belongs in the database at all.
> Ditto any replication solution that implements 'eventual consistency'
> such that after the fact conflict resolution is required.  In an SQL
> database, when a transaction commits, it should remain so.  It belongs
> in the application layer.

I basically agree, at least in the medium term. The logical
replication solutions we have today generally seem to work by watching
the inserts, updates, and deletes go by and writing the changed tuples
to a side table. This is not very performant, because it amounts to
writing the data four times: we have to write WAL for the original
change, write the data files for the original change, write more WAL
for the change records, and then write those data files. Since all
large database solutions are eventually I/O-bound, this is not great.
Writing and flushing a separate replication log in parallel to WAL
would get us down to three writes, and extracting tuple data from the
existing WAL would get us down to two writes, which is as good as we
currently know how to do.
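
To make the arithmetic concrete, here is a rough tally of the writes under
each approach. The strategy names are labels I made up for this sketch, not
PostgreSQL terms, and the lists only restate the counts given above.

```python
# Illustrative tally of the write counts described above.
# The strategy names are hypothetical labels, not PostgreSQL terminology.
WRITES = {
    # Today's approach: capture changed tuples into a side table.
    "trigger_side_table": [
        "WAL for the original change",
        "data files for the original change",
        "WAL for the change records",
        "data files for the change records",
    ],
    # A separate replication log written and flushed in parallel to WAL.
    "separate_replication_log": [
        "WAL for the original change",
        "data files for the original change",
        "replication log",
    ],
    # Tuple data extracted from the WAL that is written anyway.
    "extract_from_wal": [
        "WAL for the original change",
        "data files for the original change",
    ],
}


def write_count(strategy: str) -> int:
    """Return how many times the data is written under a strategy."""
    return len(WRITES[strategy])


for name in WRITES:
    print(f"{name}: {write_count(name)} writes")
```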

If we just had that much in core - that is, the ability to efficiently
extract tuple inserts, updates, and deletes on a logical level - it
would be much easier to build a good logical replication system around
PostgreSQL than it is today, and the existing systems could be adapted
to deliver higher performance by making use of the new infrastructure.
The other half of the changes - applying the updates - is relatively
straightforward, and it wouldn't bother me to leave that in user-land,
especially in the MMR case, where you have to deal with conflict
resolution rules that may be much simpler to express in a higher-level
language than they would be in C.
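
As a rough illustration of why the apply side might be comfortable in
user-land, here is a minimal sketch of an apply step with a pluggable
conflict-resolution rule. Everything in it is an assumption made for the
example: the Change record, the in-memory dict standing in for a table, and
the "last update wins" rule are hypothetical, and none of it is an existing
PostgreSQL interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Change:
    """Hypothetical logical change record (not a PostgreSQL structure)."""
    op: str               # "insert", "update", or "delete"
    key: int              # primary key of the affected row
    row: Optional[dict]   # new row contents; None for a delete
    origin: str           # node that produced the change
    commit_ts: float      # commit timestamp on the origin node


Resolver = Callable[[Change, Change], Change]


def last_update_wins(local: Change, incoming: Change) -> Change:
    """Example rule: keep whichever change committed later."""
    return incoming if incoming.commit_ts > local.commit_ts else local


def apply_change(table: Dict[int, dict], last_seen: Dict[int, Change],
                 incoming: Change,
                 resolve: Resolver = last_update_wins) -> bool:
    """Apply one change; return True if applied, False if it lost a conflict."""
    local = last_seen.get(incoming.key)
    # Treat it as a conflict only when another node last touched this key.
    if local is not None and local.origin != incoming.origin:
        if resolve(local, incoming) is not incoming:
            return False
    if incoming.op == "delete":
        table.pop(incoming.key, None)
    else:
        table[incoming.key] = dict(incoming.row or {})
    last_seen[incoming.key] = incoming
    return True
```

Swapping in a different rule is just a matter of passing another function
as resolve, which is the kind of thing that would be more work to express
in C.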

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
