Re: Core team statement on replication in PostgreSQL

From: Robert Hodges <robert(dot)hodges(at)continuent(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Fetter <david(at)fetter(dot)org>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Core team statement on replication in PostgreSQL
Date: 2008-05-29 19:05:18
Message-ID: C464BCFE.7E90%robert.hodges@continuent.com
Lists: pgsql-advocacy pgsql-hackers

Hi everyone,

First of all, I'm absolutely delighted that the PG community is thinking seriously about replication.

Second, having a solid, easy-to-use database availability solution that works more or less out of the box would be an enormous benefit to customers. Availability is the single biggest problem for customers in my experience and, as other people have commented, the alternatives are not nice. It's an excellent idea to build off an existing feature: PITR is already pretty useful and the proposed features are solid next steps. The fact that it does not solve all problems is not a drawback; it means it's likely to get done in a reasonable timeframe.

Third, you can't stop with just this feature. (This is the BUT part of the post.) The use cases not covered by this feature are actually pretty large. Here are a few that concern me:

1.) Partial replication.
2.) WAN replication.
3.) Bi-directional replication. (Yes, this is evil but there are problems where it is indispensable.)
4.) Upgrade support. Aside from database upgrade (how would this ever really work between versions?), it would not support zero-downtime app upgrades, which depend on bi-directional replication tricks.
5.) Heterogeneous replication.
6.) Finally, performance scaling using scale-out over large numbers of replicas. I think it's possible to get tunnel vision on this: it's not a big requirement in the PG community because people don't use PG in the first place when they want to do this. They use MySQL, which has very good replication for performance scaling, though it's rather weak for availability.

As a consequence, I don't see how you can get around doing some sort of row-based replication like all the other databases. Now that people are starting to get religion on this issue I would strongly advocate a parallel effort to put in a change-set extraction API that would allow construction of comprehensive master/slave replication. (Another approach would be to make it possible for third party apps to read the logs and regenerate SQL.) There are existing models for how to do change set extraction; we have done it several times at my company already. There are also research projects like GORDA that have looked fairly comprehensively at this problem.
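
To make the change-set idea a bit more concrete, here is a minimal sketch in Python. It is not an existing PostgreSQL interface and not how any particular product does it. Every name in it (RowChange, ChangeSet, Replica, table_filter) is hypothetical and invented for illustration. The assumption is that the extraction side hands out one ordered set of row changes per committed transaction, and that a slave applies them by primary key. A per-table filter is enough to show why this style covers partial replication, which log shipping does not.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Row = Dict[str, object]


@dataclass(frozen=True)
class RowChange:
    """One row-level change, independent of the on-disk log format (hypothetical)."""
    table: str
    op: str                  # "INSERT", "UPDATE" or "DELETE"
    key: Tuple               # primary-key value(s) identifying the row
    new_row: Optional[Row]   # full new row image; None for DELETE


@dataclass(frozen=True)
class ChangeSet:
    """All row changes of one committed transaction, in commit order (hypothetical)."""
    txid: int
    changes: Tuple[RowChange, ...]


class Replica:
    """Toy in-memory slave that applies change sets in order."""

    def __init__(self, table_filter: Callable[[str], bool] = lambda t: True):
        self.tables: Dict[str, Dict[Tuple, Row]] = {}
        self.table_filter = table_filter   # decides which tables are replicated
        self.last_txid = 0

    def apply(self, cs: ChangeSet) -> None:
        if cs.txid <= self.last_txid:
            return  # already applied, so re-delivery is harmless
        for ch in cs.changes:
            if not self.table_filter(ch.table):
                continue  # partial replication: skip tables we do not carry
            rows = self.tables.setdefault(ch.table, {})
            if ch.op == "DELETE":
                rows.pop(ch.key, None)
            else:  # INSERT and UPDATE both install the new row image
                rows[ch.key] = dict(ch.new_row)
        self.last_txid = cs.txid


# Made-up stream of two committed transactions.
stream = [
    ChangeSet(1, (
        RowChange("accounts", "INSERT", (1,), {"id": 1, "balance": 100}),
        RowChange("audit_log", "INSERT", (1,), {"id": 1, "msg": "opened"}),
    )),
    ChangeSet(2, (
        RowChange("accounts", "UPDATE", (1,), {"id": 1, "balance": 75}),
    )),
]

full = Replica()
partial = Replica(table_filter=lambda t: t == "accounts")
for cs in stream:
    full.apply(cs)
    partial.apply(cs)
```

The same stream feeds both slaves; only the filter differs. A real implementation would also have to deal with schema changes, conflict handling, and transactional apply, none of which this sketch attempts.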

My company would be quite happy to participate in or even sponsor such an API. Between the proposed WAL-based approach and change-set-based replication it's not hard to see PG becoming the open source database of choice for a very large number of users.

Cheers, Robert

On 5/29/08 6:37 PM, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

David Fetter <david(at)fetter(dot)org> writes:
> On Thu, May 29, 2008 at 08:46:22AM -0700, Joshua D. Drake wrote:
>> The only question I have is... what does this give us that PITR
>> doesn't give us?

> It looks like a wrapper for PITR to me, so the gain would be ease of
> use.

A couple of points about that:

* Yeah, ease of use is a huge concern here. We're getting beat up
because people have to go find a separate package (and figure out
which one they want), install it, learn how to use it, etc. It doesn't
help that the most mature package is Slony which is, um, not very
novice-friendly or low-admin-complexity. I personally got religion
on this about two months ago when Red Hat switched their bugzilla
from Postgres to MySQL because the admins didn't want to deal with Slony
any more. People want simple.

* The proposed approach is trying to get to "real" replication
incrementally. Getting rid of the loss window involved in file-by-file
log shipping is step one, and I suspect that step two is going to be
fixing performance issues in WAL replay to ensure that slaves can keep
up. After that we'd start thinking about how to let slaves run
read-only queries. But even without read-only queries, this will be
a useful improvement for HA/backup scenarios.

regards, tom lane

--
Robert Hodges, CTO, Continuent, Inc.
Email: robert(dot)hodges(at)continuent(dot)com
Mobile: +1-510-501-3728 Skype: hodgesrm
