Broadcast replication (Was Re: 7.4 Wishlist)

From: "Al Sutton" <al(at)alsutton(dot)com>
To: "Kevin Brown" <kevin(at)sysexperts(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Cc: <pgreplication-general(at)gborg(dot)postgresql(dot)org>
Subject: Broadcast replication (Was Re: 7.4 Wishlist)
Date: 2002-12-04 09:47:16
Message-ID: 017601c29b7a$21db49d0$0100a8c0@cloud
Lists: pgsql-advocacy pgsql-general pgsql-hackers


----- Original Message -----
From: "Kevin Brown" <kevin(at)sysexperts(dot)com>
To: <pgsql-hackers(at)postgresql(dot)org>
Sent: Tuesday, December 03, 2002 8:49 PM
Subject: [mail] Re: [HACKERS] 7.4 Wishlist

> Al Sutton wrote:
> > Point to Point and Broadcast replication
> > ----------------------------------------
> > With point to point you specify multiple endpoints, with broadcast you
> > can specify a subnet address and the updates are broadcast over that
> > subnet.
> >
> > The difference being that point to point works well for cross network
> > replication, or where you have a few replicants. I have multiple
> > database servers which could have a dedicated class C network that they
> > are all on, by broadcasting updates you can cut down the amount of
> > traffic on that net by a factor of n minus 1 (where n is the number of
> > servers involved).
>
> Yech. Now you can't use TCP anymore, so the underlying replication
> code has to handle all the issues that TCP deals with transparently,
> like error checking, retransmits, data windows, etc. I don't think
> it's wise to assume that your transport layer is 100% reliable.
>
> Further, this doesn't even address the problem of bringing up a leaf
> server that's been down a while. It can be significantly out of date
> relative to the other servers on the subnet.
>
> I suspect you'll be better off implementing a replication protocol
> that has the leaf nodes keeping each other up to date, to minimize the
> traffic coming from the next level up. Then you can use TCP for the
> connections but minimize the traffic generated by any given node.
>

I wasn't saying that ALL replication traffic must be broadcast. If a
specific server needs a refresh when it comes up, then point to point is
fine, because only one machine needs the data; broadcasting it to all
would load the other machines with data they don't need.
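
As a rough sketch of that split, and not anything that exists in PostgreSQL,
a receiving server could number the broadcast updates, apply the ones that
arrive in order, and ask for a point to point refresh only when it notices a
gap. The sequence numbers, the Replica class and its method names are all
assumptions made up for this illustration.

```python
class Replica:
    """Hypothetical receiver: applies in-order broadcasts, flags gaps."""

    def __init__(self):
        self.last_seq = 0   # highest sequence number applied so far
        self.applied = []   # updates applied, in order

    def on_broadcast(self, seq, update):
        """Handle one update received over the broadcast channel."""
        if seq <= self.last_seq:
            return "duplicate"          # already applied, ignore it
        if seq == self.last_seq + 1:
            self.applied.append(update)
            self.last_seq = seq
            return "applied"
        # A gap: this server missed something (e.g. it was down).
        return "needs_refresh"

    def on_refresh(self, updates):
        """Apply (seq, update) pairs sent point to point to this server only."""
        for seq, update in updates:
            if seq == self.last_seq + 1:
                self.applied.append(update)
                self.last_seq = seq


replica = Replica()
first = replica.on_broadcast(1, "a")    # arrives in order
gap = replica.on_broadcast(3, "c")      # update 2 was missed
replica.on_refresh([(2, "b"), (3, "c")])
```

Only the server that fell behind receives the refresh data; the others keep
consuming the broadcast stream.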

The aim of using broadcast is to cut down the ongoing traffic. For
example, if I have a cluster of ten database servers, I can connect them to
a dedicated LAN shared only by database servers, and I would see roughly 10%
of the traffic I would get with point to point, since each update goes on
the wire once rather than once per peer. (This assumes that adding error
checking, retransmits, etc. to the broadcast protocol adds a similar
overhead per packet to that of TCP point to point.)
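
The arithmetic behind that estimate can be sketched as below. It is a
simplified model that assumes one originating server, one packet per update
per destination, and equal per-packet overhead for both schemes; the function
name and parameters are made up for illustration.

```python
def packets_on_wire(servers: int, updates: int, broadcast: bool) -> int:
    """Count packets one originating server puts on the shared LAN.

    Assumes every update fits in a single packet and ignores
    acknowledgements and retransmits.
    """
    if servers < 2:
        return 0                      # nobody to replicate to
    peers = servers - 1               # every server except the sender
    copies_per_update = 1 if broadcast else peers
    return updates * copies_per_update


point_to_point = packets_on_wire(servers=10, updates=1000, broadcast=False)
broadcast = packets_on_wire(servers=10, updates=1000, broadcast=True)
ratio = broadcast / point_to_point    # fraction of the point to point traffic
```

For ten servers the model gives 1000 packets against 9000, a reduction by a
factor of n minus 1, which is in line with the rough figure quoted above.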

If others wish to know more about this, I can prepare an overview of how I
see it working.

[Other points snipped]
