Re: Applying logical replication changes by more than one process

From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, PostgreSQL Developers <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Applying logical replication changes by more than one process
Date: 2016-03-21 13:18:27
Message-ID: 56EFF4A3.3040609@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 21/03/16 14:15, Andres Freund wrote:
>
>
> On March 21, 2016 2:08:54 PM GMT+01:00, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:
>> On 21/03/16 13:44, Konstantin Knizhnik wrote:
>>>
>>>
>>> On 21.03.2016 15:10, Petr Jelinek wrote:
>>>> Hi,
>>>>
>>>> On 19/03/16 11:46, Konstantin Knizhnik wrote:
>>>>> Hi,
>>>>>
>>>>> I am trying to use logical replication mechanism in implementation
>> of
>>>>> PostgreSQL multimaster and faced with one conceptual problem.
>>>>> Originally logical replication was intended to support asynchronous
>>>>> replication. In this case applying changes by single process should
>> not
>>>>> be a bottleneck.
>>>>> But if we are using distributed transaction manager to provide
>> global
>>>>> consistency, then applying transaction by one worker leads to very
>> bad
>>>>> performance and what is worser: cause unintended serialization of
>>>>> transactions, which is not taken in account by distributed deadlock
>>>>> detection algorithm and so can cause
>>>>> undetected deadlocks.
>>>>>
>>>>> So I have implemented pool of background workers which can apply
>>>>> transactions concurrently.
>>>>> It works and shows acceptable performance. But now I am thinking
>> about
>>>>> HA and tracking origin LSNs which are needed to correctly specify
>> slot
>>>>> position in case of recovery. And there is a problem: as far as I
>>>>> understand to correctly record origin LSN in WAL and advance slot
>>>>> position it is necessary to setup session
>>>>> using replorigin_session_setup. It is not so convenient in case of
>> using
>>>>> pool of background workers, because we have to setup session for
>> each
>>>>> commit.
>>>>> But the main problem is that for each slot session can be
>> associated
>>>>> only with one process:
>>>>>
>>>>> else if (curstate->acquired_by != 0)
>>>>> {
>>>>> ereport(ERROR,
>>>>> (errcode(ERRCODE_OBJECT_IN_USE),
>>>>> errmsg("replication identifier %d is already active
>> for
>>>>> PID %d",
>>>>> curstate->roident, curstate->acquired_by)));
>>>>> }
>>>>>
>>>>> Which once again means that there can be only one process applying
>>>>> changes.
>>>>>
>>>>
>>>> That's not true, all it means is that you can do
>>>> replorigin_session_setup for same origin only in one process but you
>>>> don't need to have it setup for session to update it, the
>>>> replorigin_advance() works just fine.
>>>
>>> But RecordTransactionCommit is using replorigin_session_advance, not
>>> replorigin_advance.
>>
>> Only when the origin is actually setup for the current session. You
>> need
>> to call the replorigin_advance yourself from your apply code.
>
> That's problematic from a durability POV.
>

Huh? How come?

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2016-03-21 13:25:17 Re: Applying logical replication changes by more than one process
Previous Message Andres Freund 2016-03-21 13:15:39 Re: Applying logical replication changes by more than one process