Re: Future In-Core Replication

From:	Merlin Moncure <mmoncure(at)gmail(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Bruce Momjian <bruce(at)momjian(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Future In-Core Replication
Date:	2012-04-30 21:35:04
Message-ID:	CAHyXU0ywyWd7QmZa0kLFcE+y=Wf_LEbKq5Bhm++fP3z8D=4Y1w@mail.gmail.com
Lists:	pgsql-hackers

On Mon, Apr 30, 2012 at 2:38 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Mon, Apr 30, 2012 at 2:33 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>> On Mon, Apr 30, 2012 at 12:38 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>>> For example, you said that "MM replication alone is not a solution for
>>> large data or the general case". Why is that? Is the goal of your work
>>> really to do logical replication, which allows for major version
>>> upgrades? Is that the defining feature?
>>
>> TBH, I don't think MM replication belongs in the database at all.
>> Ditto any replication solution that implements 'eventual consistency'
>> such that after the fact conflict resolution is required. In an SQL
>> database, when a transaction commits, it should remain so. It belongs
>> in the application layer.
>
> I basically agree, at least in the medium term. The logical
> replication solutions we have today generally seem to work by watching
> the inserts, updates, and deletes go by and writing the changed tuples
> to a side table. This is not very performant, because it amounts to
> writing the data four times: we have to write WAL for the original
> change, write the data files for the original change, write more WAL
> for the change records, and then write those data files. Since all
> large database solutions are eventually I/O-bound, this is not great.
> Writing and flushing a separate replication log in parallel to WAL
> would get us down to three writes, and extracting tuple data from the
> existing WAL would get us down to two writes, which is as well as we
> ever know how to do.
>
> If we just had that much in core - that is, the ability to efficiently
> extract tuple inserts, updates, and deletes on a logical level - it
> would be much easier to build a good logical replication system around
> PostgreSQL than it is today, and the existing systems could be adapted
> to deliver higher performance by making use of the new infrastructure.
> The other half of the changes - applying the updates - is relatively
> straightforward, and it wouldn't bother me to leave that in user-land,
> especially in the MMR case, where you have to deal with conflict
> resolution rules that may be much simpler to express in a higher-level
> language than they would be in C.

Yeah -- here at $work the SQL Server team (once in a while we cross
no-man's land and converse) has some fancy technology that sits
directly on top of the transaction log and exposes an API that you can
use to peek into the river of data running through the log and do
stuff with it. In our case, they use it to triage extracts from about
100 or so distributed databases into a centralized store in a
relatively realtime fashion. HS/SR simply can't do that and there
would be tremendous value in something that could.
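
As a rough illustration of the write counts described in the quoted text above, here is a minimal Python sketch. The approach names and the per-change write lists are assumptions taken only from that description (four, three, and two writes); nothing here reflects actual PostgreSQL code or any existing API.

```python
# Hypothetical tally of physical writes per logical change, following the
# counts in the quoted message. All labels are illustrative assumptions,
# not PostgreSQL internals.
WRITES_PER_CHANGE = {
    # Trigger-style capture into a side table: everything is written twice.
    "side_table_capture": [
        "WAL for the original change",
        "data files for the original change",
        "WAL for the change records",
        "data files for the change records",
    ],
    # A separate replication log written and flushed in parallel to WAL.
    "separate_replication_log": [
        "WAL for the original change",
        "data files for the original change",
        "replication log written in parallel to WAL",
    ],
    # Tuple data extracted from the WAL that is already being written.
    "extract_from_existing_wal": [
        "WAL for the original change",
        "data files for the original change",
    ],
}


def write_count(approach: str) -> int:
    """Number of writes one logical change costs under the given approach."""
    return len(WRITES_PER_CHANGE[approach])


def relative_io(approach: str, baseline: str = "extract_from_existing_wal") -> float:
    """Write volume of an approach relative to the baseline approach."""
    return write_count(approach) / write_count(baseline)


if __name__ == "__main__":
    for name in WRITES_PER_CHANGE:
        print(f"{name}: {write_count(name)} writes, "
              f"{relative_io(name):.1f}x baseline")
```

Under these assumptions, side-table capture costs twice the writes of extracting changes from the existing WAL, which is why it matters once a system is I/O-bound.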

merlin

In response to

Re: Future In-Core Replication at 2012-04-30 19:38:56 from Robert Haas

Responses

Re: Future In-Core Replication at 2012-04-30 22:43:46 from Josh Berkus
