Re: Logical replication and multimaster

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Logical replication and multimaster
Date: 2015-12-03 12:39:30
Message-ID: CANP8+jLmPXKxDsqXwf6RNSp9Q-6uc7UcZo=KNfCzH0PeehhE=w@mail.gmail.com
Lists: pgsql-hackers

On 30 November 2015 at 17:20, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru> wrote:

> But looks like there is not so much sense in having multiple network
> connection between one pair of nodes.
> It seems to be better to have one connection between nodes, but provide
> parallel execution of received transactions at destination side. But it
> seems to be also nontrivial. We have now in PostgreSQL some infrastructure
> for background works, but there is still no abstraction of workers pool and
> job queue which can provide simple way to organize parallel execution of
> some jobs. I wonder if somebody is working now on it or we should try to
> propose our solution?
>

There are definitely two clear places where additional help would be useful
and welcome right now.

1. Allowing logical decoding to have a "speculative pre-commit data"
option, so that some data can be made available via the decoding API and
transferred prior to commit. This would allow us to reduce the delay that
occurs at commit, especially for larger transactions or for smaller
transactions with very low latency requirements. Some heuristic or user
interface would be required to decide whether, and for which transactions,
data is made available prior to commit. And we would need to send abort
messages should the transactions not commit as expected. That would be a
patch on logical decoding and is essentially a separate feature from
anything currently being developed.
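
As a rough illustration of what point 1 implies for the receiving side, here
is a minimal Python sketch. It is not taken from any existing patch or from
the logical decoding API: the message kinds ("change", "commit", "abort"),
the class name and its fields are all hypothetical. It only shows the idea
that changes streamed before commit must be held back per transaction, and
thrown away if an abort message arrives.

```python
from collections import defaultdict


class SpeculativeReceiver:
    """Holds changes streamed before commit (hypothetical message shapes)."""

    def __init__(self):
        # xid -> changes received so far for a transaction still in progress
        self.pending = defaultdict(list)
        # changes made visible downstream, in the order commits arrived
        self.applied = []

    def on_message(self, kind, xid, change=None):
        if kind == "change":
            # Data arrives before commit; buffer it, do not make it visible.
            self.pending[xid].append(change)
        elif kind == "commit":
            # Only now do the buffered changes become visible.
            self.applied.extend(self.pending.pop(xid, []))
        elif kind == "abort":
            # The transaction did not commit as expected; discard its data.
            self.pending.pop(xid, None)
        else:
            raise ValueError(f"unknown message kind: {kind}")
```

Feeding it a change for xid 100, a change for xid 101, an abort for 100 and
a commit for 101 leaves only the change from 101 in `applied`. A real
implementation would also have to deal with subtransactions, spilling large
transactions to disk, and catalog changes, none of which is modelled here.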

2. Some mechanism/theory to decide when/if to allow parallel apply. That
could be used for both physical and logical replication. Since the apply
side of logical replication is still being worked on, there is a code
dependency there, so a working solution isn't what is needed yet. But the
general principles, and any changes they would require to the data content
(wal_level) or protocol (pglogical_output), would be useful.
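
One possible shape for such a rule, offered purely as an assumption for
discussion, is "transactions may be applied concurrently if the rows they
write do not overlap". The Python sketch below groups transactions, given in
commit order, into batches under that rule. The function name and the
(table, key) representation of a write set are hypothetical, and this is not
how pglogical or any existing apply code decides anything.

```python
def plan_parallel_batches(transactions):
    """Group transactions into batches with pairwise-disjoint write sets.

    `transactions` is an iterable of (xid, keys) pairs in commit order,
    where `keys` is a collection of hashable row identifiers such as
    (table, primary_key). Batches must be applied one after another;
    transactions inside a batch could be applied in parallel.
    """
    batches = []
    current, touched = [], set()
    for xid, keys in transactions:
        keys = set(keys)
        if touched & keys:
            # Conflict with something in the current batch: close it so
            # that commit order is preserved for the overlapping rows.
            batches.append(current)
            current, touched = [], set()
        current.append(xid)
        touched |= keys
    if current:
        batches.append(current)
    return batches
```

For example, four transactions writing ("t", "a"), ("t", "b"), ("t", "a") and
("u", "a") in that order come out as two batches, [1, 2] and [3, 4]. The
sketch ignores foreign keys, unique indexes, DDL and anything else that makes
two transactions dependent without touching the same row, which is exactly
where the general principles would need to be worked out.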

We already have working multi-master that has been contributed to PGDG, so
contributing that won't gain us anything. There is a lot of code and
pglogical is the most useful piece of code to be carved off and reworked
for submission. The bottleneck is review and commit, not initial
development - which applies both to this area and most others in PostgreSQL.

Having a single network connection between nodes would increase efficiency
but also increase replication latency, so it's not useful in all cases.

I think having some kind of message queue between nodes would also help,
since there are many cases for which we want to transfer data, not just a
replication data flow. For example, consensus on DDL, or MPP query traffic.
But that is open to wider debate.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
