Re: src/test/subscription/t/002_types.pl hanging on particular environment

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: src/test/subscription/t/002_types.pl hanging on particular environment
Date: 2017-09-20 04:16:23
Message-ID: CAMsr+YGGb8LwgD8vcBYdiepmdJSMBVAFekHm8cWnxifYZy9O=g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 20 September 2017 at 11:53, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Craig Ringer <craig(at)2ndquadrant(dot)com> writes:
> > On 19 September 2017 at 18:04, Petr Jelinek <
> petr(dot)jelinek(at)2ndquadrant(dot)com>
> > wrote:
> >> If you are asking why they are not identified by the
> >> BackgroundWorkerHandle, then it's because it's private struct and can't
> >> be shared with other processes so there is no way to link the logical
> >> worker info with bgworker directly.
>
> > I really want BackgroundWorkerHandle to be public, strong +1 from me.
>
> I'm confused about what you think that would accomplish. AFAICS, the
> point of BackgroundWorkerHandle is to allow the creator/requestor of
> a bgworker to verify whether or not the slot in question is still
> "owned" by its request. This is necessarily not useful to any other
> process, since they didn't make the request.
>

I'm using shm_mq in a sort of "connection accepter" role, where a pool of
shm_mq's are attached to by a long lived bgworker. Other backends, both
bgworkers and normal user backends, can find a free slot and attach to it
to talk to the long lived bgworker. These other backends are not
necessarily started by the long lived worker, so it doesn't have a
BackgroundWorkerHandle for them.

Currently, if the long lived bgworker dies and a peer attempts to attach to
the slot, they'll hang forever in shm_mq_wait_internal, because it cannot
use shm_mq_set_handle as described in
https://www.postgresql.org/message-id/E1XbwOs-0002Fd-H9%40gemulon.postgresql.org
to protect its self.

See also thread
https://www.postgresql.org/message-id/CAMsr%2BYHmm%3D01LsuEYR6YdZ8CLGfNK_fgdgi%2BQXUjF%2BJeLPvZQg%40mail.gmail.com
.

If the BackgroundWorkerHandle for the long-lived bgworker is copied to a
small static control shmem segment, the connecting workers can use that to
reliably bail out if the long-lived worker dies.

> The thought I had in mind upthread was to get rid of logicalrep slots
> in favor of expanding the underlying bgworker slot with some additional
> fields that would carry whatever extra info we need about a logicalrep
> worker. Such fields could also be repurposed for additional info about
> other kinds of bgworkers, when (not if) we find out we need that.
>

That sounds OK to me personally for in-core logical rep, but it's really
Petr and Peter who need to have a say here, not me.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2017-09-20 04:42:15 Re: [HACKERS] Re: pgsql: Make new crash restart test a bit more robust.
Previous Message Amit Kapila 2017-09-20 04:15:14 Re: why not parallel seq scan for slow functions