Re: Core team statement on replication in PostgreSQL

From: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
To: "Josh Berkus" <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>, "Bruce Momjian" <bruce(at)momjian(dot)us>, "David Fetter" <david(at)fetter(dot)org>, "Marko Kreen" <markokr(at)gmail(dot)com>
Subject: Re: Core team statement on replication in PostgreSQL
Date: 2008-05-30 02:59:21
Message-ID: b42b73150805291959o7856c27cm99617f2edba9eb1@mail.gmail.com
Lists: pgsql-advocacy pgsql-hackers

On Thu, May 29, 2008 at 9:26 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>> I fully accept that it may be the case that it doesn't make technical
>> sense to tackle them in any order besides sync->read-only slaves because
>> of dependencies in the implementation between the two. If that's the
>> case, it would be nice to explicitly spell out what that was to deflect
>> criticism of the planned prioritization.
>
> There's a very simple reason to prioritize the synchronous log shipping first;
> NTT may open source their solution and we'll get it a lot sooner than the
> other components.

That's a good argument. I just read the NTT document and the stuff
looks fantastic. You've convinced me...it just doesn't seem prudent
to forge ahead with hot standby without dealing with all the
synchronous changes to WAL logging first. I just want you guys to
understand how important hot standby is to a lot of people. Sync
logging may be less so, but having a proof of concept implementation
significantly alters the bang/buck ratio.

> That is, we expect that synch log shipping is *easier* than read-only slaves
> and will get done sooner. Since there are quite a number of users who could
> use this, whether or not they can run queries on the slaves, why not ship
> that feature as soon as it's done?

I think what dfetter, etc. were saying is that we should elevate the
hot standby stuff to a requirement, or at least a future requirement.
IOW, we should try and avoid doing anything which would make it harder
than it already is. Please understand that I don't think people on
the list were trying to be negative...the failure of hot standby to
materialize in the 8.3 cycle was a bitter pill for many people. I
personally see this new thinking as a hugely positive development.

> There's also a number of issues with using the current log shipping method
> for replication. In addition to the previously mentioned setup pains, there's
> the 16MB chunk size for shipping log segments, which is fine for data
> warehouses but kind of sucks for a web application with a 3GB database which
> may take 2 hours to go through 16MB. So we have to change the shipping method
> anyway, and if we're doing that, why not work on synch?

Well, there is the archive_timeout setting...but point taken. A big
use case for hot standby is OLTP environments where you get to combine
the HA and reporting servers into a single box.

> Mind you, if someone wanted to get started on read-only slaves *right now* I
> can't imagine anyone would object. There's a number of problems to solve
> with recovery mode, table locking etc. that can use some work even before we
> deal with changes to log shipping, or XID writeback or any of the other
> issues. So, volunteers?

As I see it, sync logging, hot standby, and improved setup features
are all mostly orthogonal. Florian took some pretty decent notes
during his analysis and outlined the problem areas pretty well. That
would be a starting point. It just strikes me that for all this stuff to
have even a remote chance of making 8.4, the work needs to be divided
up into teams and conquered separately.

merlin
