Re: Multi-Master Logical Replication

From: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Multi-Master Logical Replication
Date: 2022-04-29 04:16:44
Message-ID: 797cfbbeae44582d42b36e035865c7f45681de8f.camel@postgrespro.ru
Lists: pgsql-hackers

On Thu, 28/04/2022 at 17:37 +0530, vignesh C wrote:
> On Thu, Apr 28, 2022 at 4:24 PM Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> wrote:
> > On Thu, 28/04/2022 at 09:49 +1000, Peter Smith wrote:
> >
> > > 1.1 ADVANTAGES OF MMLR
> > >
> > > - Increases write scalability (e.g., all nodes can write arbitrary data).
> >
> > I've never heard how transaction-aware multimaster increases
> > write scalability. Moreover, usually even non-transactional
> > multimaster doesn't increase write scalability. At best it
> > doesn't decrease it.
> >
> > That is because all hosts have to write all changes anyway. But
> > overhead grows due to increased network traffic and
> > interlocking (for transaction-aware MM) and increased latency.
>
> I agree it won't increase in all cases, but it will be better in a few
> cases when the user works on different geographical regions operating
> on independent schemas in asynchronous mode. Since the write node is
> closer to the geographical zone, the performance will be better in a
> few cases.

From the EnterpriseDB BDR page [1]:

> Adding more master nodes to a BDR Group does not result in
> significant write throughput increase when most tables are
> replicated because BDR has to replay all the writes on all nodes.
> Because BDR writes are in general more effective than writes coming
> from Postgres clients via SQL, some performance increase can be
> achieved. Read throughput generally scales linearly with the number
> of nodes.

And I'm sure EnterpriseDB does this about as well as it can be done.
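
The argument above can be put into a back-of-the-envelope model. This is
only a sketch under stated assumptions: apply_capacity (total writes/sec
one node can apply) and replay_cost (cost of replaying a remote write
relative to a local client write) are hypothetical parameters invented
for illustration, not measured or documented figures.

```python
def max_client_writes_per_node(apply_capacity, nodes, replay_cost=1.0):
    """Upper bound on client writes/sec that each node can accept.

    apply_capacity: total writes/sec a single node can apply (assumed).
    replay_cost:    cost of replaying one remote write relative to one
                    local client write (assumed; < 1.0 means cheaper).
    """
    # Each node applies its own w client writes plus the writes of the
    # other (nodes - 1) nodes:
    #   w + (nodes - 1) * w * replay_cost <= apply_capacity
    return apply_capacity / (1 + (nodes - 1) * replay_cost)


def cluster_write_throughput(apply_capacity, nodes, replay_cost=1.0):
    """Total client writes/sec accepted by the whole group."""
    return nodes * max_client_writes_per_node(apply_capacity, nodes, replay_cost)
```

In this model, with replay_cost = 1.0 the group total stays equal to
apply_capacity no matter how many nodes are added. With replay_cost = 0.5
the total grows with the node count but never reaches twice the capacity
of a single node, which matches "some performance increase can be
achieved". Network and locking overhead are ignored here and would only
lower the numbers.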

> > On Thu, 28/04/2022 at 08:34 +0000, kuroda(dot)hayato(at)fujitsu(dot)com wrote:
> > > Dear Laurenz,
> > >
> > > Thank you for your interest in our works!
> > >
> > > > I am missing a discussion how replication conflicts are handled to
> > > > prevent replication from breaking
> > >
> > > Actually we don't have plans for developing the feature that avoids conflict.
> > > We think that it should be done as core PUB/SUB feature, and
> > > this module will just use that.
> >
> > If you really want to have some proper isolation levels
> > (Read Committed? Repeatable Read?) and/or want to have the
> > same data on each "master", there is no easy way. If you
> > think it will be "easy", you are already wrong.
>
> The synchronous_commit and synchronous_standby_names configuration
> parameters will help in getting the same data across the nodes. Can
> you give an example for the scenario where it will be difficult?

So, synchronous or asynchronous?
Synchronous commit on every master, on every live master, or on a
quorum of masters?

And it is not about synchronicity. It is about resolving conflicts
deterministically.

If you have fully deterministic conflict resolution that works
exactly the same way on each host, then it is possible to have the
same data on each host. (But it will not be transactional.) And it
seems EDB BDR has achieved this.
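
A minimal sketch of what "deterministic conflict resolution" means in
practice. It uses a last-writer-wins rule with a node-id tie-break,
chosen here only for illustration; it is not a description of how any
particular product resolves conflicts. The row version layout
(commit_ts, origin_node, value) and all names are hypothetical, and each
node is assumed never to emit two versions with the same timestamp.

```python
from itertools import permutations


def resolve(current, incoming):
    """Pick the winner between two versions of the same row.

    A version is (commit_ts, origin_node, value). Comparing the
    (commit_ts, origin_node) pair gives a total order, so every host
    picks the same winner regardless of the order changes arrive in.
    """
    if current is None:
        return incoming
    return max(current, incoming, key=lambda v: (v[0], v[1]))


def apply_all(changes):
    """Apply a stream of conflicting changes to one row."""
    row = None
    for change in changes:
        row = resolve(row, change)
    return row


changes = [
    (100, "node_a", "balance=50"),
    (100, "node_b", "balance=70"),  # same timestamp: node id breaks the tie
    (90, "node_c", "balance=10"),
]

# Every possible arrival order ends in the same final row.
final_states = {apply_all(order) for order in permutations(changes)}
assert final_states == {(100, "node_b", "balance=70")}
```

The rule is applied per row. A transaction that changed two rows can
therefore win on one row and lose on the other, so the hosts converge
to the same data but transaction boundaries are not preserved.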

The same is possible if you have fully and correctly implemented one
of the distributed transaction protocols.

[1] https://www.enterprisedb.com/docs/bdr/latest/overview/#characterising-bdr-performance

regards

------

Yura Sokolov
