Re: tablesync patch broke the assumption that logical rep depends on?

From: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: tablesync patch broke the assumption that logical rep depends on?
Date: 2017-04-19 02:13:36
Message-ID: fa387e24-0e26-c02d-ef16-7e46ada200dd@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 17/04/17 20:09, Peter Eisentraut wrote:
> On 4/16/17 22:01, Noah Misch wrote:
>> This PostgreSQL 10 open item is past due for your status update. Kindly send
>> a status update within 24 hours, and include a date for your subsequent status
>> update. Refer to the policy on open item ownership:
>> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
>
> I think we're not really sure yet what to do about this. Discussion is
> ongoing. I'll report back on Wednesday.
>

So my idea was to add some kind of inuse flag. This turned out to be bit
more complicated in terms of how to clean it than I would have hoped.
This is due to the fact that there is no way to reliably tell if worker
has failed to start if the parent worker crashed while waiting.

My solution to that is to use similar logic to autovacuum where we use
timeout for worker to attach to shmem. We do this only if there is no
free slot found when launch of replication worker was requested.

While working on this patch I also noticed other subtle concurrency
issues and fixed them as well - stopping worker that hasn't finished
starting yet wasn't completely concurrency safe and limiting sync
workers per subscription theoretically wasn't either (although I don't
think it could happen in practice).

I do wonder now though if it's such a good idea to have the
BackgroundWorkerHandle private to the bgworker.c given that this is the
3rd time (twice before was outside of postgres core) I had to write
similar generation mechanism that it uses for unique bgworker
authentication inside the process which started it. It would have been
much easier if I could just save the BackgroundWorkerHandle itself to
shmem so it could be used across processes instead of having to reinvent
it every time.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0001-Fix-various-concurrency-issues-in-logical-replicatio.patch text/plain 10.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Petr Jelinek 2017-04-19 02:18:18 Re: some review comments on logical rep code
Previous Message Jeff Janes 2017-04-19 02:09:00 Re: Failed recovery with new faster 2PC code