Re: tablesync patch broke the assumption that logical rep depends on?

From: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: tablesync patch broke the assumption that logical rep depends on?
Date: 2017-04-25 19:42:23
Message-ID: 4e91ac45-774e-e8fe-65a3-a79e3915193b@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 22/04/17 22:09, Petr Jelinek wrote:
> On 21/04/17 16:31, Petr Jelinek wrote:
>> On 21/04/17 16:23, Peter Eisentraut wrote:
>>> On 4/21/17 10:11, Petr Jelinek wrote:
>>>> On 21/04/17 16:09, Peter Eisentraut wrote:
>>>>> On 4/20/17 14:29, Petr Jelinek wrote:
>>>>>> + /* Find unused worker slot. */
>>>>>> + if (!w->in_use)
>>>>>> {
>>>>>> - worker = &LogicalRepCtx->workers[slot];
>>>>>> - break;
>>>>>> + worker = w;
>>>>>> + slot = i;
>>>>>> + }
>>>>>
>>>>> Doesn't this still need a break? Otherwise it always picks the last slot.
>>>>>
>>>>
>>>> Yes it will pick the last slot, does that matter though, is the first
>>>> one better somehow?
>>>>
>>>> We can't break because we also need to continue the counter (I think the
>>>> issue that the counter solves is probably just theoretical, but still).
>>>
>>> I see. I think the code would be less confusing if we break the loop
>>> like before and call logicalrep_sync_worker_count() separately.
>>>
>>>> Hmm actually, maybe the if (!w->in_use) should be if (worker == NULL &&
>>>> !w->in_use)?
>>>
>>> That would also do it. But it's getting a bit fiddly.
>>>
>>
>> I just wanted to avoid looping twice, especially since the garbage
>> collecting code has to also do the loop. I guess I'll go with my
>> original coding for this then which was to put retry label above the
>> loop first, then try finding worker slot, if found call the
>> logicalrep_sync_worker_count and if not found do the garbage collection
>> and if we cleaned up something then goto retry.
>>
>
> Here is the patch doing just that.
>

And one more revision which also checks in_use when attaching shared
memory. This is mainly to improve the user visible behavior in
theoretical corner case when the worker is supposed to be cleaned up but
actually still manages to start (it would still fail even without this
change but the error message would be more obscure).

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
Fix-various-concurrency-issues-in-logical-replicatiov4.patch binary/octet-stream 11.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2017-04-25 19:49:56 Re: scram and \password
Previous Message Bruce Momjian 2017-04-25 19:30:12 Re: PG 10 release notes