Re: [RFC][PATCH] wal decoding, attempt #2 - Design Documents (really attached)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: [RFC][PATCH] wal decoding, attempt #2 - Design Documents (really attached)
Date: 2012-10-15 02:54:20
Message-ID: CA+TgmoY7hUvtXOvbHyLhsmoYf3B3G3XmzJYGO5jBA48DurdpTg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 11, 2012 at 3:15 AM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> IMHO that's a good thing, and I'd hope this new logical replication to live
> outside core as well, as much as possible. But whether or not something is
> in core is just a political decision, not a reason to implement something
> new.
>
> If the only meaningful advantage is reducing the amount of WAL written, I
> can't help thinking that we should just try to address that in the existing
> solutions, even if it seems "easy to solve at a first glance, but a solution
> not using a normal transactional table for its log/queue has to solve a lot
> of problems", as the document says. Sorry to be a naysayer, but I'm pretty
> scared of all the new code and complexity these patches bring into core.

I think what we're really missing at the moment is a decent way of
decoding WAL. There are a decent number of customers who, when
presented with a replication system, start by asking whether it's
trigger-based or WAL-based. When you answer that it's trigger-based,
their interest goes... way down. If you tell them the triggers are
written in anything but C, you lose a bunch more points. Sure, some
people's concerns are overblown, but it's hard to escape the
conclusion that a WAL-based solution can be a lot more efficient than
a trigger-based solution, and EnterpriseDB has gotten comments from a
number of people who upgraded to 9.0 or 9.1 to the effect that SR was
way faster than Slony.

I do not personally believe that a WAL decoding solution adequate to
drive logical replication can live outside of core, at least not
unless core exposes a whole lot more interface than we do now, and
probably not even then. Even if it could, I don't see the case for
making every replication solution reinvent that wheel. It's a big
wheel to be reinventing, and everyone needs pretty much the same
thing.

That having been said, I have to agree that the people working on this
project seem to be wearing rose-colored glasses when it comes to the
difficulty of implementing a full-fledged solution in core. I'm right
on board with everything up to the point where we start kicking out a
stream of decoded changes to the user... and that's about it. To pick
on Slony for the moment, as the project that has been around for the
longest and has probably the largest user base (outside of built-in
SR, perhaps), they've got a project that they have been developing for
years and years and years. What have they been doing all that time?
Maybe they are just stupid, but Chris and Jan and Steve don't strike
me that way, so I think the real answer is that they are solving
problems that we haven't even started to think about yet, especially
around control logic: how do you turn it on? how do you turn it off?
how do you handle node failures? how do you handle it when a node
gets behind? We are not going to invent good solutions to all of
those problems between now and January, or even between now and next
January.

> PS. I'd love to see a basic Slony plugin for this, for example, to see how
> much extra code on top of the posted patches you need to write in a plugin
> like that to make it functional. I'm worried that it's a lot..

I agree. I would go so far as to say that if Slony can't integrate
with this work and use it in place of their existing change-capture
facility, that's sufficient grounds for unconditional rejection.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
