Re: [HACKERS] Replication documentation addition

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Richard Troy <rtroy(at)ScienceTools(dot)com>
Cc:	Hannu Krosing <hannu(at)skype(dot)net>, PostgreSQL-documentation <pgsql-docs(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] Replication documentation addition
Date:	2006-10-25 19:31:09
Message-ID:	200610251931.k9PJV9416962@momjian.us
Lists:	pgsql-docs pgsql-hackers

Richard Troy wrote:
>
> > Here is a new replication documentation section I want to add for 8.2:
> >
> > ftp://momjian.us/pub/postgresql/mypatches/replication
> >
>
> ...Read the document, as promised...
>
> First paragraph, "(fail over)" is inconsistent with title, "failover", as
> are other spots throughout the document. The whole document should be
> consistent and I vote for "failover" and not "fail over."

OK, fixed to "failover".

> Fourth paragraph, "This "sync problem" is the fundamental difficulty for
> servers working together"; "sync problem" hasn't been defined. Actually,
> you're talking about the consistency attribute of the "ACID" properties of
> all competent databases: Atomicity, Consistency, Isolation, and Durability.
> At least define the term you are using - probably most easily done in the
> preceding paragraph.

OK, "sync problem" term removed, and spelled out fully.

> The fifth paragraph needs a lot more help, I think. How about this
> alternative:
>
> So-called "two-phase commit" was developed as a strategy in which two or
> more databases are updated simultaneously and none of the data is
> committed until all are committed. This guarantees consistency between the
> databases with all propagation delay being absorbed by the writer at write
> time. There are times when this propagation delay is large, so sometimes
> alternatives are worked out which we'll call here "asynchronous updates."
> However, in these cases, there is always a window of time in which some
> transactions can be lost should a failure occur. For this reason,
> asynchronous updates are only used when the possibility of such losses is
> acceptable.

I have modified the paragraph to use some of your terms.
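
To make the two-phase commit idea in the quoted paragraph concrete, here is
a minimal sketch in Python. It is an illustration only: the Participant
class and the two_phase_commit function are hypothetical in-memory
stand-ins, not PostgreSQL's implementation, and they ignore coordinator
crashes and timeouts.

```python
class Participant:
    """Hypothetical in-memory stand-in for one database server."""

    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.fail_on_prepare = fail_on_prepare  # simulate a server voting "no"
        self.committed = {}   # data that has been committed
        self.pending = None   # change that is prepared but not yet committed

    def prepare(self, key, value):
        # Phase 1: either promise to commit this change or vote no.
        if self.fail_on_prepare:
            return False
        self.pending = (key, value)
        return True

    def commit(self):
        # Phase 2: make the prepared change visible.
        key, value = self.pending
        self.committed[key] = value
        self.pending = None

    def rollback(self):
        # Discard the prepared change.
        self.pending = None


def two_phase_commit(participants, key, value):
    """Return True if every participant committed, False if all rolled back."""
    prepared = []
    for p in participants:
        if p.prepare(key, value):
            prepared.append(p)
        else:
            # One "no" vote aborts the change on every server.
            for q in prepared:
                q.rollback()
            return False
    # Everyone voted yes, so nothing is committed until all can commit.
    for p in participants:
        p.commit()
    return True
```

The writer waits for every server to answer in both phases, which is the
propagation delay "absorbed by the writer at write time" mentioned above.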

> Paragraphs six through to "shared disk failover" seem very awkward to me.
> I don't like them at all.
>
> "Shared disk failover" has nothing to do with "the sync problem" as it's
> not a multiple-database solution. It's an uptime, "24 X 7 X 365" issue.
> Further, it also has nothing to do with disk arrays, though it is often
> used with RAID to help avoid disk based corruption problems.

Yes, please see updated version. I removed the sync problem term from
there.

> The point about Warm Standby needs to include a warning about WAL that it
> MUST be sensitive to the semantics of the database design or else it's
> fatally flawed. I'm talking about "referential integrity". That is to say,
> it's inappropriate to capture updates on a table-by-table basis, as some
> such systems do (I have no idea what's done by anyone in the PG world on
> this right now), because an update to one table (esp. inserts) very often
> goes hand in glove with updates in other tables, and to get one without
> the other can corrupt a database.

We don't have that problem. We recover only full transactions.
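
As a rough illustration of what "recover only full transactions" means for
the referential integrity concern above, here is a minimal sketch. The log
format and the replay function are assumptions made up for this example;
this is not PostgreSQL's WAL format or its recovery code.

```python
def replay(log):
    """Apply only the inserts whose transaction has a COMMIT record.

    Each log entry is a tuple: ("INSERT", txid, table, row) or
    ("COMMIT", txid). This layout is hypothetical.
    """
    # First pass: find every transaction that reached COMMIT.
    committed = {txid for op, txid, *_ in log if op == "COMMIT"}

    # Second pass: apply changes from committed transactions only, across
    # all tables, so related rows arrive together or not at all.
    tables = {}
    for op, txid, *rest in log:
        if op == "INSERT" and txid in committed:
            table, row = rest
            tables.setdefault(table, []).append(row)
    return tables


log = [
    ("INSERT", 1, "orders", {"id": 1}),
    ("INSERT", 1, "order_items", {"order_id": 1}),
    ("COMMIT", 1),
    ("INSERT", 2, "orders", {"id": 2}),  # no COMMIT: lost in the failure
]
recovered = replay(log)
```

In this sketch transaction 1 is recovered in both tables, while the
unfinished transaction 2 leaves no partial rows behind.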

> The description of "Continuously running replication server" should
> include the critical caveat - repeated if you think it's already said
> elsewhere - that it is ONLY suitable for applications in which a loss of
> (missing) update data doesn't matter. For example, an airline reservation
> system would be an inappropriate application for such a "solution" because
> what seats are available cannot be guaranteed to be correct.

I have added a note about data loss for the Slony item.

> Regarding data partitioning, I strongly disagree with the opening sentence
> in that it doesn't split a database into sets, it splits tables into sets.

OK, changed.

> Data partitioning is often done within a single database on a single
> server and therefore, as a concept, has nothing whatsoever to do with
> different servers. Similarly, the second paragraph of this section is

Uh, why would someone split things up like that on a single server?

> problematic. Please define your term first, then talk about some
> implementations - this is muddying the water. Further, there are both
> vertical and horizontal partitioning - you mention neither - and each has
> its own distinct uses. If partitioning is mentioned, it should be more
> complete.

Uh, what exactly needs to be defined?

> Next, Query Broadcast Load Balancing... also needs a lot of work. First,
> it's foremost in my memory that sending read queries everywhere and
> returning the first result set back is a key way to improve application
> performance at the cost of additional load on other systems - I guess
> that's not at all what the document is after here, but it's a worthy part
> of a dialogue on broadcasting queries. In other words, this has more parts
> to it than just what the document now entertains. Secondly, the document

Uh, do we want to go into that here? I guess I could.

> doesn't address _at_all_ whether this is a two-phase-commit environment
> or not. If not, how are updates managed? If each server operates
> independently and one of them fails, what do you do then? How do you know
> _any_ server got an insert/update? ... Each server _can't_ operate
> independently unless the application does its own insert/update commits to
> every one of them - and that can't be fast, nor does it load balance,
> though it may contribute to superior uptime performance by the
> application.

I think having the application middle layer do the commits is how it
works now. Can someone explain how pgpool works, or should we mention
how two-phase commit has to be done here? pgpool2 has additional
features.
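
For readers following the discussion, here is a minimal sketch of a middle
layer that broadcasts writes and load-balances reads. It is not how pgpool
is implemented: the BroadcastPool class is hypothetical, and in-memory
SQLite databases stand in for separate servers so the example runs on its
own.

```python
import itertools
import sqlite3


class BroadcastPool:
    """Hypothetical middle layer: writes go everywhere, reads go to one."""

    def __init__(self, n_servers):
        # Each in-memory SQLite database stands in for one server.
        self.servers = [sqlite3.connect(":memory:") for _ in range(n_servers)]
        self._next = itertools.cycle(self.servers)

    def write(self, sql, params=()):
        # Every server runs the statement and commits on its own. Without
        # two-phase commit, a failure partway through this loop leaves the
        # servers out of sync, which is the concern raised above.
        for conn in self.servers:
            conn.execute(sql, params)
            conn.commit()

    def read(self, sql, params=()):
        # Round-robin: only reads are spread across the servers.
        conn = next(self._next)
        return conn.execute(sql, params).fetchall()


pool = BroadcastPool(3)
pool.write("CREATE TABLE seats (id INTEGER PRIMARY KEY, taken INTEGER)")
pool.write("INSERT INTO seats VALUES (?, ?)", (1, 0))
results = [pool.read("SELECT id, taken FROM seats") for _ in range(3)]
```

Note that every write costs as much as the slowest server, so only the
read side gains from load balancing in this arrangement.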

> Next up; I'm not aware of any current products or projects that provide
> parallel query execution, though Informix might - I can ask a colleague or
> two. Either way, it's probably best to simply define the term (perhaps in
> a little more detail), and not mention solutions - they change with time
> anyway.

Actually, Bizgres MPP, based on PostgreSQL, does this, but mostly for
read-only queries.

> While I've never used Oracle's clustering tools, I've read up on them and
> have customers who use them, and I think this description of Oracle
> clustering is a mis-read on what the Oracle system actually does. A check
> with a true Oracle clustering expert is in order here.

OK, would someone please comment?

> Hope this helps. If asked, I'm willing to (re)write some of the bits
> discussed above.

Yes, please review the URL and let me know what else to change. Thanks.

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

In response to

Re: Replication documentation addition at 2006-10-25 18:40:22 from Richard Troy

Responses

Re: Replication documentation addition at 2006-10-27 19:57:34 from Richard Troy
