Re: Logical replication in the same cluster

From: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Logical replication in the same cluster
Date: 2017-04-27 08:08:03
Message-ID: be583b22-a343-5663-c3e5-01bc37ce3ad7@2ndquadrant.com
Lists: pgsql-hackers

On 27/04/17 04:50, Tom Lane wrote:
> Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> writes:
>>>> If that's a predictable deadlock, I think a minimum expectation is that
>>>> the system should notice it and throw an error, not just hang.
>
>> We had some discussions early on about detecting connections to the same
>> server, but it's not entirely clear how to do that and it didn't seem
>> worth it at the time.
>
> I wonder whether we actually need to detect connections to the same
> server per se. I'm thinking about the one end taking some special
> heavyweight lock, and the other end taking the same lock, which would
> generally be free as long as the two ends aren't on the same server.
> Cascading replication might be a problem though ...
>

Well, cascading might not be a problem. This issue exists only during slot creation, which is a one-time operation; that's also why the workaround solves it. But I don't see what we could lock that's common between publisher and subscriber, unless we invent some database object specifically for this purpose.

My idea in the original thread was to put the info about the xid and sysid somewhere in shmem when creating a subscription, and to check on the other side whether the sysid is the same as the local one and the xid is active. It would serialize subscription creation, but I don't see that as a big issue; it's not like it's common to create thousands of them in parallel, nor is it something where we care about shaving microseconds off the runtime.
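For what it's worth, the system identifier such a check would compare is already visible at SQL level via pg_control_system() (available since 9.6), so the "same cluster" half of the test is cheap to sketch (how the remote value is fetched is left open here):

```sql
-- Rough sketch of the same-cluster half of the check: compare the
-- local system identifier with the one reported by the remote end.
SELECT system_identifier FROM pg_control_system();

-- If the value returned over the subscription's connection matches
-- the local one, we are talking to our own cluster, and the
-- in-shmem xid check would then decide whether to error out.
```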

Back when writing the original patch set, I was also playing with the idea of having CREATE SUBSCRIPTION do multiple committed steps, in a similar fashion to CREATE INDEX CONCURRENTLY, but that leaves a mess behind on failure, which also wasn't a very popular outcome. I wonder how bad it would be if we created all the catalog state for the subscription, but in disabled form, then committed, then created the slot outside of a transaction (slot creation is not transactional anyway), and then switched the subscription to enabled (if needed) in the next transaction. It would still leave the subscription behind on failure, but a) the user would see the failure, and b) the subscription would be inactive, so no active harm from it. We also already prevent running CREATE SUBSCRIPTION inside a transaction block when automatic slot creation is chosen, so there is no difference from that perspective.
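The user-visible equivalent of that multi-step flow would be roughly the following (a sketch; the subscription, publication, and connection names are just examples, and the WITH options are the ones CREATE SUBSCRIPTION already accepts):

```sql
-- Step 1 (transactional): create the subscription disabled, no slot yet.
CREATE SUBSCRIPTION mysub
    CONNECTION 'host=publisher dbname=src'
    PUBLICATION mypub
    WITH (enabled = false, create_slot = false);

-- Step 2 (non-transactional, run on the publisher): create the slot.
SELECT pg_create_logical_replication_slot('mysub', 'pgoutput');

-- Step 3 (transactional): enable the subscription.
ALTER SUBSCRIPTION mysub ENABLE;
```

The internal version would do the same three steps from a single command, which is why a failure between steps leaves a (disabled) subscription behind.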

Just for info: in pglogical we solve this by having the worker create the slot, not the user command, so then it just works. The reason I didn't do this in core is that, in practice, it does not seem very user friendly when there are errors (not enough free slots, connecting not only to the same server but to the same db, etc.), because the user will only see the errors in the logs after the fact (and often they don't look there). I am already unhappy that we have no facility for a bgworker to save its last error before dying into a place accessible via SQL, and I'd rather not hide even more errors in the log.

Note that the workaround for all of this is not all that complex: you do the same thing (create the slot manually) that you'd do for physical replication with slots.
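Concretely, the workaround looks like this (slot, subscription, publication, and connection names are just examples):

```sql
-- On the publisher: create the logical slot manually, outside of
-- CREATE SUBSCRIPTION, so the command cannot deadlock waiting on itself.
SELECT pg_create_logical_replication_slot('mysub', 'pgoutput');

-- On the subscriber: point the subscription at the pre-created slot.
CREATE SUBSCRIPTION mysub
    CONNECTION 'host=publisher dbname=src'
    PUBLICATION mypub
    WITH (create_slot = false);
```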

Thoughts?

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
