Re: Design for In-Core Logical Replication

From:	Rod Taylor <rod(dot)taylor(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Design for In-Core Logical Replication
Date:	2016-07-20 16:52:33
Message-ID:	CAKddOFCdzL=qVNUS628kgeVdWKJd9Udf4bRpit825rZeJgNVXQ@mail.gmail.com
Lists:	pgsql-hackers

On Wed, Jul 20, 2016 at 4:08 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

>
> <para>
> And on Subscriber database:
> <programlisting>
> CREATE SUBSCRIPTION mysub WITH CONNECTION <quote>dbname=foo host=bar
> user=repuser</quote> PUBLICATION mypub;
> </programlisting>
> </para>
> <para>
> The above will start the replication process which synchronizes the
> initial table contents of <literal>users</literal> and
> <literal>departments</literal> tables and then starts replicating
> incremental changes to those tables.
> </para>
> </sect1>
> </chapter>
>

I think it's important for communication channels to be defined separately
from the subscriptions.

If I have nodes 1/2 + 3/4 which operate in pairs, I don't really want to
need a script that reconfigures replication on 3/4 every time we do
maintenance on 1 or 2.

3/4 need to know they subscribe to mypub and that they have connections to
machine 1 and machine 2. The replication system should be able to figure
out which (of 1/2) has the most recently available data.
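
As a rough sketch of what "figure out which has the most recent data" could
mean, assume each connection can report the last WAL position it has
available, in PostgreSQL's usual textual LSN form (two hex numbers separated
by a slash), and that positions from the two machines are directly
comparable. The Source type and both function names are hypothetical; this
is an illustration, not a proposal for the actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """A hypothetical candidate upstream, e.g. machine1 or machine2."""
    name: str
    last_lsn: str  # textual LSN such as "0/16B3748" (assumed to be reported)


def parse_lsn(text: str) -> int:
    """Convert a textual LSN 'hi/lo' (hex) into a single 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def freshest(sources: list[Source]) -> Source:
    """Return the source whose reported position is furthest ahead."""
    if not sources:
        raise ValueError("no sources to choose from")
    return max(sources, key=lambda s: parse_lsn(s.last_lsn))


candidates = [
    Source("machine1", "0/16B3748"),
    Source("machine2", "0/16B3800"),
]
chosen = freshest(candidates)
```

With the two example positions above, machine2 is chosen because its
position is numerically higher.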

So, I'd rather have:

CREATE CONNECTION machine1;
CREATE CONNECTION machine2;
CREATE SUBSCRIPTION TO PUBLICATION mypub;

Notice that I deliberately did not tell it which connection to use to get
the publication. If we did want to express a preference, the DNS weighting
model might be appropriate.
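
To illustrate what I mean by DNS weighting, here is a minimal sketch in the
style of SRV records: the lowest priority value is tried first, and within
that priority a connection is picked at random in proportion to its weight.
The WeightedConnection type, its priority and weight fields, and
pick_connection are all hypothetical names; nothing like them exists in the
proposal.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class WeightedConnection:
    """A hypothetical connection with SRV-style preference fields."""
    name: str
    priority: int  # lower value is preferred
    weight: int    # relative share among connections of equal priority


def pick_connection(connections, rng=random):
    """Pick one connection: lowest priority first, then weighted random."""
    if not connections:
        raise ValueError("no connections defined")
    best = min(c.priority for c in connections)
    group = [c for c in connections if c.priority == best]
    weights = [c.weight for c in group]
    if sum(weights) == 0:
        # All weights are zero: fall back to a uniform choice.
        return rng.choice(group)
    return rng.choices(group, weights=weights, k=1)[0]


preferred = pick_connection([
    WeightedConnection("machine1", priority=10, weight=1),
    WeightedConnection("machine2", priority=20, weight=100),
])
```

Here machine1 is always chosen, because priority is considered before
weight; the weights only matter when both machines share a priority.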

I'm not certain the subscription needs to be named. IMO, a publication
should have the same properties on all nodes (so any node may become the
primary source). If a subscriber needs different behaviour for a
publication, it should be created as a different publication.

Documenting that ThisPub is different from ThatPub is easier than
documenting that ThisPub on node 1/2/4 is different from ThisPub on node
7/8, except Node 7 is temporarily on Node 4 too (database X instead of
database Y) due to that power problem.

Clearly this is advanced. An initial implementation may only allow mypub
from a single connection.

I also suspect multiple publications will be normal even with only 2 nodes.
Old, slow-moving data almost always gets different treatment than fast-moving
data, even if that only means defining which set needs to hit the other node
first and which set can trickle through later.

regards,

Rod Taylor

In response to

Design for In-Core Logical Replication at 2016-07-20 08:08:09 from Simon Riggs

Responses

Re: Design for In-Core Logical Replication at 2016-07-20 17:20:30 from Simon Riggs
