Re: Multi-Master Logical Replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Multi-Master Logical Replication
Date: 2022-06-10 09:29:57
Message-ID: CAA4eK1KJz0t6r6TzUs5MHVgcfbMoaD+57gtZ3weA0uWuAxRp9A@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 10, 2022 at 12:40 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Fri, Jun 10, 2022 at 9:54 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > 1. Are you proposing to use logical replication subscribers to be in
> > > sync quorum? In other words, in an N-masters node, M (M >= N)-node
> > > configuration, will each master be part of the sync quorum in the
> > > other master?
> > >
> >
> > What exactly do you mean by sync quorum here? If you mean to say that
> > each master node will be allowed to wait till the commit happens on
> > all other nodes similar to how our current synchronous_commit and
> > synchronous_standby_names work, then yes, it could be achieved. I
> > think the patch currently doesn't support this but it could be
> > extended to support the same. Basically, one can be allowed to set up
> > async and sync nodes in combination depending on its use case.
>
> Yes, I meant each master node will be in synchronous_commit with
> others. In this setup, do you see any problems such as deadlocks if
> write-txns on the same table occur on all the masters at a time?
>

I have not tried it, but in theory I don't see why deadlocks should
happen unless transactions update an overlapping set of rows in
conflicting order, much as can happen on a single node. In that case,
one of the conflicting transactions will error out and need to be
retried. IOW, I think the behavior should be the same as on a single
node. Do you have any particular examples in mind?
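
As a rough illustration of "conflicting order", here is a toy Python
model. It is not PostgreSQL's lock manager; the function name
find_stuck and the round-robin stepping are assumptions made only for
this sketch. Each transaction takes row locks in a fixed order and
releases them all when it finishes.

```python
def find_stuck(lock_orders):
    """Return the transactions that can never finish (sorted by name).

    lock_orders maps a transaction name to the rows it locks, in order.
    Transactions are stepped round-robin, one lock attempt per turn.
    """
    owner = {}                              # row -> transaction holding it
    pos = {txn: 0 for txn in lock_orders}   # index of the next row to lock
    progressed = True
    while progressed:
        progressed = False
        for txn, rows in lock_orders.items():
            if pos[txn] == len(rows):
                continue                    # already committed
            row = rows[pos[txn]]
            if owner.get(row, txn) != txn:
                continue                    # blocked by another transaction
            owner[row] = txn
            pos[txn] += 1
            progressed = True
            if pos[txn] == len(rows):
                # "Commit": release every lock this transaction holds.
                for held in [r for r, t in owner.items() if t == txn]:
                    del owner[held]
    # Anything unfinished once no one can progress is stuck in a wait cycle
    # (or waiting on one).
    return sorted(t for t in lock_orders if pos[t] < len(lock_orders[t]))
```

With opposite orders, {"T1": ["a", "b"], "T2": ["b", "a"]}, both
transactions end up waiting on each other; with the same order on both
sides, they simply serialize. A real system would abort one of the
stuck transactions so that it can be retried.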

> If the master nodes are not in synchronous_commit i.e. connected in
> asynchronous mode, don't we have data synchronization problems because
> of logical decoding and replication latencies? Say, I do a bulk-insert
> to a table foo on master 1. Imagine there's a latency with which the
> inserted rows get replicated to master 2 and meanwhile I do an update
> on the same table foo on master 2 based on the rows inserted on master
> 1 - master 2 doesn't have all the rows inserted on master 1 - how does
> the solution proposed here address this problem?
>
>

I don't think that is possible even in theory, and none of the other
n-way replication solutions I have read about seems to claim to offer
something like that. It is quite possible that I am missing something
here, but why would we want such a requirement from asynchronous
replication? I think in such cases, even for load balancing, we can
distribute reads where eventually consistent data is acceptable, and
distribute writes that go to separate tables/partitions.
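
The lag scenario from the question can be sketched with a toy Python
model. Everything here (the Node class, replicate, demo) is
hypothetical and is not how logical replication is implemented; it
only shows that an update on master 2 sees whatever rows have arrived
so far. For brevity the update is not replicated back to master 1.

```python
from collections import deque


class Node:
    """Toy stand-in for one master: table foo plus a queue of unsent inserts."""

    def __init__(self):
        self.foo = {}            # primary key -> value
        self.pending = deque()   # inserts not yet applied on the peer

    def insert(self, key, value):
        self.foo[key] = value
        self.pending.append((key, value))

    def update_all(self, value):
        """Update every row currently visible on this node; return the count."""
        for key in self.foo:
            self.foo[key] = value
        return len(self.foo)


def replicate(src, dst, max_changes):
    """Apply up to max_changes queued inserts from src to dst."""
    applied = 0
    while src.pending and applied < max_changes:
        key, value = src.pending.popleft()
        dst.foo[key] = value
        applied += 1
    return applied


def demo():
    m1, m2 = Node(), Node()
    for key in (1, 2, 3):                       # bulk insert on master 1
        m1.insert(key, "new")
    replicate(m1, m2, max_changes=1)            # only one row has arrived
    touched = m2.update_all("processed")        # update runs on master 2
    replicate(m1, m2, max_changes=10)           # the rest arrives later
    return touched, dict(m2.foo)
```

In this model the update touches only the single row that had arrived,
and the two late rows land on master 2 unmodified. That is the
eventually consistent behavior described above, not something
asynchronous replication can rule out.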

I haven't responded to some of your other points as they are
associated with the above theory.

>
> > > 4. Can the design proposed here be implemented as an extension instead
> > > of a core postgres solution?
> > >
> >
> > Yes, I think it could be. I think this proposal introduces some system
> > tables, so need to analyze what to do about that. BTW, do you see any
> > advantages to doing so?
>
> IMO, yes, doing it the extension way has many advantages - it doesn't
> have to touch the core part of postgres, usability will be good -
> whoever requires this solution will use it, and we can avoid code
> chunks within the core of the if (feature_enabled) { do foo } else
> { do bar } sort. Since this feature is based on core postgres logical
> replication infrastructure, I think it's worth implementing it as an
> extension first, maybe the extension as a PoC?
>
>

I don't know if it requires the kind of code you are thinking of, but
I agree that it is worth considering implementing it as an extension.

--
With Regards,
Amit Kapila.
