Re: Applying logical replication changes by more than one process

From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, PostgreSQL Developers <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Applying logical replication changes by more than one process
Date: 2016-03-21 13:08:54
Message-ID: 56EFF266.8090708@2ndquadrant.com
Lists: pgsql-hackers

On 21/03/16 13:44, Konstantin Knizhnik wrote:
>
>
> On 21.03.2016 15:10, Petr Jelinek wrote:
>> Hi,
>>
>> On 19/03/16 11:46, Konstantin Knizhnik wrote:
>>> Hi,
>>>
>>> I am trying to use the logical replication mechanism in an implementation
>>> of PostgreSQL multimaster and have run into a conceptual problem.
>>> Logical replication was originally intended to support asynchronous
>>> replication, where applying changes in a single process should not
>>> be a bottleneck.
>>> But if we use a distributed transaction manager to provide global
>>> consistency, then applying transactions in one worker leads to very bad
>>> performance and, what is worse, causes unintended serialization of
>>> transactions. The distributed deadlock detection algorithm does not
>>> take that serialization into account, so it can cause
>>> undetected deadlocks.
>>>
>>> So I have implemented a pool of background workers that can apply
>>> transactions concurrently.
>>> It works and shows acceptable performance. But now I am thinking about
>>> HA and tracking origin LSNs, which are needed to correctly specify the
>>> slot position in case of recovery. And there is a problem: as far as I
>>> understand, to correctly record the origin LSN in WAL and advance the
>>> slot position, it is necessary to set up a session
>>> using replorigin_session_setup. That is inconvenient with a
>>> pool of background workers, because we have to set up a session for each
>>> commit.
>>> But the main problem is that, for each slot, a session can be associated
>>> with only one process:
>>>
>>>     else if (curstate->acquired_by != 0)
>>>     {
>>>         ereport(ERROR,
>>>                 (errcode(ERRCODE_OBJECT_IN_USE),
>>>                  errmsg("replication identifier %d is already active for PID %d",
>>>                         curstate->roident, curstate->acquired_by)));
>>>     }
>>>
>>> This once again means that only one process can apply
>>> changes.
>>>
>>
>> That's not true. All it means is that you can call
>> replorigin_session_setup for the same origin in only one process. But you
>> don't need to have the origin set up for the session in order to update
>> it; replorigin_advance() works just fine.
>
> But RecordTransactionCommit uses replorigin_session_advance, not
> replorigin_advance.

Only when the origin is actually set up for the current session. You need
to call replorigin_advance() yourself from your apply code.
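
To make the distinction concrete, here is a rough toy model of the two
paths. It is not PostgreSQL source: the toy_* names and the ToyOrigin
struct are hypothetical, and the forward-only update rule is an assumption
of the sketch. It only illustrates that session setup is exclusive per
origin, while workers that never acquire the session can still record
progress through an explicit advance call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an LSN; not PostgreSQL's XLogRecPtr. */
typedef uint64_t ToyLsn;

/* Hypothetical, heavily simplified per-origin progress state. */
typedef struct ToyOrigin
{
    int    roident;      /* origin identifier */
    int    acquired_by;  /* PID that set up a session, 0 if none */
    ToyLsn remote_lsn;   /* highest origin commit LSN recorded so far */
} ToyOrigin;

/*
 * Models the exclusive session path: a second process trying to
 * acquire the same origin is refused (the real code raises an ERROR).
 */
static bool
toy_session_setup(ToyOrigin *o, int pid)
{
    if (o->acquired_by != 0)
        return false;
    o->acquired_by = pid;
    return true;
}

/*
 * Models the explicit path: the caller passes the commit LSN itself and
 * does not need to own a session. Assumption of this sketch: progress
 * only ever moves forward.
 */
static void
toy_advance(ToyOrigin *o, ToyLsn remote_commit)
{
    if (o->remote_lsn < remote_commit)
        o->remote_lsn = remote_commit;
}

/*
 * Models a pool of apply workers finishing commits in arbitrary order,
 * each reporting its own commit LSN. Returns the recorded progress.
 */
static ToyLsn
toy_apply_from_pool(ToyOrigin *o, const ToyLsn *commits, int n)
{
    for (int i = 0; i < n; i++)
        toy_advance(o, commits[i]);
    return o->remote_lsn;
}
```

In this model no pool worker ever calls toy_session_setup(), so the
exclusivity check is never hit; each worker just reports the LSN of the
commit it applied. Whether a single forward-only LSN is enough for
recovery when commits are applied out of order is a separate question
that the sketch does not answer.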

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
