Re: Replication documentation addition

From: Richard Troy <rtroy(at)ScienceTools(dot)com>
To: Hannu Krosing <hannu(at)skype(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL-documentation <pgsql-docs(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Replication documentation addition
Date: 2006-10-25 18:40:22
Message-ID: Pine.LNX.4.33.0610251043410.30114-100000@denzel.in
Lists: pgsql-docs pgsql-hackers


> Here is a new replication documentation section I want to add for 8.2:
>
> ftp://momjian.us/pub/postgresql/mypatches/replication
>

...Read the document, as promised...

First paragraph, "(fail over)" is inconsistent with title, "failover", as
are other spots throughout the document. The whole document should be
consistent and I vote for "failover" and not "fail over."

Fourth paragraph, "This "sync problem" is the fundamental difficulty for
servers working together"; "Sync problem" hasn't been defined. Actually,
you're talking about the consistency attribute of the "ACID" properties of
all competent databases: Atomicity, Consistency, Isolation, and Durability.
At least define the term you are using - probably most easily done in the
preceding paragraph.

The fifth paragraph needs a lot more help, I think. How about this
alternative:

So-called "two-phase commit" was developed as a strategy in which two or
more databases are updated simultaneously and none of the data is
committed until all are committed. This guarantees consistency between the
databases, with all propagation delay being absorbed by the writer at write
time. There are times when this propagation delay is large, so sometimes
alternatives are worked out, which we'll call here "asynchronous updates."
In these cases, however, there is always a window of time in which some
transactions can be lost should a failure occur. For this reason,
asynchronous updates are only used when the possibility of such losses is
acceptable.
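
To make the proposed wording concrete, here is a minimal sketch of the
prepare/commit flow in Python. The Participant class and the
two_phase_commit function are hypothetical in-memory stand-ins made up for
illustration, not any database's API, and the sketch assumes the
coordinator itself never fails between the two phases, which a real
implementation cannot assume.

```python
class Participant:
    """Hypothetical in-memory stand-in for one database in the group."""

    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.fail_on_prepare = fail_on_prepare
        self.committed = {}   # data visible to readers
        self.pending = None   # staged write, not yet visible

    def prepare(self, key, value):
        # Phase 1: stage the write and vote on whether it can be committed.
        if self.fail_on_prepare:
            return False
        self.pending = (key, value)
        return True

    def commit(self):
        # Phase 2 (all voted yes): make the staged write visible.
        key, value = self.pending
        self.committed[key] = value
        self.pending = None

    def rollback(self):
        # Phase 2 (someone voted no): discard the staged write.
        self.pending = None


def two_phase_commit(participants, key, value):
    """Return True if every participant committed, False if all rolled back."""
    votes = [p.prepare(key, value) for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

The writer waits for every vote before anything becomes visible, which is
exactly where the propagation delay mentioned above is paid.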

Paragraphs six through to "shared disk failover" seem very awkward to me.
I don't like them at all.

"Shared disk failover" has nothing to do with "the sync problem" as it's
not a multiple-database solution. It's an uptime, "24 X 7 X 365" issue.
Further, it also has nothing to do with disk arrays, though it is often
used with RAID to help avoid disk based corruption problems.

The point about Warm Standby needs to include a warning about WAL that it
MUST be sensitive to the semantics of the database design or else it's
fatally flawed. I'm talking about "referential integrity". That is to say,
it's inappropriate to capture updates on a table-by-table basis, as some
such systems do (I have no idea what's done by anyone in the PG world on
this right now), because an update to one table (esp. an insert) very often
goes hand in glove with updates in other tables, and getting one without
the other can corrupt a database.
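
A small sketch of the concern, using Python's sqlite3 module purely as a
convenient stand-in (this says nothing about how WAL shipping works in
PostgreSQL). The customers/orders schema, the captured change list and the
replay function are all assumptions invented for the example.

```python
import sqlite3


def make_replica():
    """Create an empty replica with the same schema as the primary."""
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")  # enforce referential integrity
    db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    db.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, "
        "customer_id INTEGER NOT NULL REFERENCES customers(id))"
    )
    return db


# One transaction on the primary touched two tables.
changes = [
    ("customers", "INSERT INTO customers VALUES (1, 'Ann')"),
    ("orders", "INSERT INTO orders VALUES (10, 1)"),
]


def replay(db, changes, only_table=None):
    """Apply captured changes; return True if they applied cleanly."""
    try:
        with db:  # commits on success, rolls back on error
            for table, sql in changes:
                if only_table is None or table == only_table:
                    db.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


whole = replay(make_replica(), changes)                         # both tables
partial = replay(make_replica(), changes, only_table="orders")  # orders only
```

Here the replica at least rejects the orphaned order because the foreign
key is enforced; without that constraint it would quietly accept a row
pointing at a customer that does not exist.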

The description of "Continuously running replication server" should
include the critical caveat - repeated if you think it's already said
elsewhere - that it is ONLY suitable for applications in which a loss of
(missing) update data doesn't matter. For example, an airline reservation
system would be an inappropriate application for such a "solution" because
what seats are available cannot be guaranteed to be correct.
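
The reservation example can be sketched in a few lines of Python. The
Primary and Replica classes and the ship function are hypothetical,
in-memory stand-ins; the only point is the window between acknowledging a
write and shipping it.

```python
from collections import deque


class Primary:
    def __init__(self):
        self.booked = set()
        self.outbox = deque()  # changes not yet shipped to the replica

    def book(self, seat):
        self.booked.add(seat)     # acknowledged to the client right away
        self.outbox.append(seat)  # shipped later, asynchronously


class Replica:
    def __init__(self):
        self.booked = set()

    def apply(self, seat):
        self.booked.add(seat)


def ship(primary, replica, max_changes):
    """Move at most max_changes queued changes to the replica."""
    for _ in range(min(max_changes, len(primary.outbox))):
        replica.apply(primary.outbox.popleft())


primary, replica = Primary(), Replica()
primary.book("12A")
primary.book("12B")
ship(primary, replica, max_changes=1)  # only 12A arrives before the failure

# The primary fails here and the replica takes over with what it has.
lost = primary.booked - replica.booked
```

After the failover the replica believes 12B is still free, even though a
customer was told it was booked.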

Regarding data partitioning, I strongly disagree with the opening sentence
in that it doesn't split a database into sets, it splits tables into sets.
Data partitioning is often done within a single database on a single
server and therefore, as a concept, has nothing whatsoever to do with
different servers. Similarly, the second paragraph of this section is
problematic. Please define your term first, then talk about some
implementations - this is muddying the water. Further, there are both
vertical and horizontal partitioning - you mention neither - and each has
its own distinct uses. If partitioning is mentioned, it should be more
complete.
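
For what it's worth, the two kinds of partitioning can be shown on a single
table with no servers involved at all. This is only a sketch in Python over
a list of dicts; the sample rows and both function names are made up for
illustration.

```python
rows = [
    {"id": 1, "region": "east", "name": "Ann", "notes": "n1"},
    {"id": 2, "region": "west", "name": "Bob", "notes": "n2"},
    {"id": 3, "region": "east", "name": "Cy", "notes": "n3"},
]


def horizontal_partition(rows, column):
    """Split rows into sets by the value of one column; columns are unchanged."""
    parts = {}
    for row in rows:
        parts.setdefault(row[column], []).append(row)
    return parts


def vertical_partition(rows, key, column_groups):
    """Split columns into narrower tables; every piece keeps the key column."""
    return [
        [{col: row[col] for col in (key, *group)} for row in rows]
        for group in column_groups
    ]


by_region = horizontal_partition(rows, "region")
hot, cold = vertical_partition(rows, "id", [("region", "name"), ("notes",)])
```

Horizontal partitioning keeps whole rows and divides them up; vertical
partitioning keeps every row but divides its columns, with the key repeated
so the pieces can be joined back together.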

Next, Query Broadcast Load Balancing... also needs a lot of work. First,
it's foremost in my memory that sending read queries everywhere and
returning the first result set back is a key way to improve application
performance at the cost of additional load on other systems - I guess
that's not at all what the document is after here, but it's a worthy part
of a dialogue on broadcasting queries. In other words, this has more parts
to it than just what the document now entertains. Secondly, the document
doesn't address _at_all_ whether this is a two-phase-commit environment
or not. If not, how are updates managed? If each server operates
independently and one of them fails, what do you do then? How do you know
_any_ server got an insert/update? ... Each server _can't_ operate
independently unless the application does its own insert/update commits to
every one of them - and that can't be fast, nor does it load balance,
though it may contribute to superior uptime performance by the
application.
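
The read-side variant I have in mind, sending the same read query
everywhere and taking the first answer, looks roughly like this in Python.
The servers here are simulated with sleeping functions; make_server and
broadcast_read are hypothetical names, and the sketch assumes every server
holds identical data and that the first server to finish does not fail.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def make_server(name, delay, data):
    """Simulate a server that answers a key lookup after `delay` seconds."""
    def run_query(key):
        time.sleep(delay)
        return name, data[key]
    return run_query


def broadcast_read(servers, key):
    """Send the same read to every server; return the first answer."""
    pool = ThreadPoolExecutor(max_workers=len(servers))
    futures = [pool.submit(server, key) for server in servers]
    done, _ = wait(futures, return_when=FIRST_COMPLETED)
    pool.shutdown(wait=False)  # do not block on the slower servers
    return next(iter(done)).result()


servers = [
    make_server("slow", 0.2, {"seat": "12A"}),
    make_server("fast", 0.01, {"seat": "12A"}),
]
winner, value = broadcast_read(servers, "seat")
```

Every server still does the full work of the query, which is the extra load
mentioned above. Nothing like this is safe for inserts or updates unless
something else guarantees that every server applied them.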

Next up: I'm not aware of any current products or projects that provide
parallel query execution, though Informix might - I can ask a colleague or
two. Either way, it's probably best to simply define the term (perhaps in
a little more detail), and not mention solutions - they change with time
anyway.

While I've never used Oracle's clustering tools, I've read up on them and
have customers who use them, and I think this description of Oracle
clustering is a misreading of what the Oracle system actually does. A check
with a true Oracle clustering expert is in order here.

Hope this helps. If asked, I'm willing to (re)write some of the bits
discussed above.

Regards,
Richard

--
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
rtroy(at)ScienceTools(dot)com, http://ScienceTools.com/
