Re: Logical Replication WIP

From: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, Steve Singer <steve(at)ssinger(dot)info>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Logical Replication WIP
Date: 2016-12-15 09:06:06
Message-ID: f2a315e3-f6dd-7d1a-3d7c-9a26a0dd3ea0@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 13/12/16 01:33, Andres Freund wrote:
>
> On 2016-12-12 09:18:48 -0500, Peter Eisentraut wrote:
>> On 12/8/16 4:10 PM, Petr Jelinek wrote:
>>> On 08/12/16 20:16, Peter Eisentraut wrote:
>>>> On 12/6/16 11:58 AM, Peter Eisentraut wrote:
>>>>> On 12/5/16 6:24 PM, Petr Jelinek wrote:
>>>>>> I think that the removal of changes to ReplicationSlotAcquire() that you
>>>>>> did will result in making it impossible to reacquire temporary slot once
>>>>>> you switched to different one in the session as the if (active_pid != 0)
>>>>>> will always be true for temp slot.
>>>>>
>>>>> I see. I suppose it's difficult to get a test case for this.
>>>>
>>>> I created a test case, saw the error of my ways, and added your code
>>>> back in. Patch attached.
>>>>
>>>
>>> Hi,
>>>
>>> I am happy with this version, thanks for moving it forward.
>>
>> committed
>
> Hm.
>
> /*
> + * Cleanup all temporary slots created in current session.
> + */
> +void
> +ReplicationSlotCleanup()
>
> I'd rather see a (void) there. The prototype has it, but still.
>
>
> +
> + /*
> + * No need for locking as we are only interested in slots active in
> + * current process and those are not touched by other processes.
>
> I'm a bit suspicious of this claim. Without a memory barrier you could
> actually look at outdated versions of active_pid. In practice there's
> enough full memory barriers in the slot creation code that it's
> guaranteed to not be the same pid from before a wraparound though.
>
> I think that doing iterations of slots without
> ReplicationSlotControlLock makes things more fragile, because suddenly
> assumptions that previously held aren't true anymore. E.g. factually
> /*
> * The slot is definitely gone. Lock out concurrent scans of the array
> * long enough to kill it. It's OK to clear the active flag here without
> * grabbing the mutex because nobody else can be scanning the array here,
> * and nobody can be attached to this slot and thus access it without
> * scanning the array.
> */
> is now simply not true anymore. It's probably not harmfully broken, but
> at least you've changed the locking protocol without adapting comments.
>
>

Any thoughts on attached? Yes it does repeated scans which can in theory
be slow but as I explained in the comment, in practice there is not much
need to have many temporary slots active within single session so it
should not be big issue.

I am not quite convinced that all the locking is necessary from the
current logic perspective TBH but it should help prevent mistakes by
whoever changes things in slot.c next.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0002-Improve-behavior-of-ReplicationSlotCleanup.patch text/x-diff 4.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2016-12-15 09:08:25 Re: Quorum commit for multiple synchronous replication.
Previous Message Magnus Hagander 2016-12-15 09:04:37 pg_basebackups and slots