Re: Parallel worker hangs while handling errors.

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel worker hangs while handling errors.
Date: 2020-08-11 14:38:11
Message-ID: CALDaNm3hi_NKnJLaiXdTZ-us50kexR-eFH0rJ5p+TjV=yeD9Og@mail.gmail.com
Lists: pgsql-hackers

On Fri, Aug 7, 2020 at 1:34 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Tue, Jul 28, 2020 at 5:35 AM Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> > The v4 patch looks good to me. Hang is not seen, make check and make
> > check-world passes. I moved this to the committer for further review
> > in https://commitfest.postgresql.org/29/2636/.
>
> I don't think I agree with this approach. In particular, I don't
> understand the rationale for unblocking only SIGUSR1. Above, Vignesh
> says that he feels that unblocking only that signal would be the right
> approach, but no reason is given. I have two reasons why I suspect
> it's not the right approach. One, it doesn't seem to be what we do
> elsewhere; the only existing cases where we have special handling for
> particular signals are SIGQUIT and SIGPIPE, and those places have
> comments explaining the reason why they are handled in a special way.
> Two, SIGUSR1 is used for a LOT of things: look at all the different
> cases procsignal_sigusr1_handler() checks. If the intention is to only
> allow the things we know are safe, rather than all the signals there
> are, I think this coding utterly fails to achieve that - and for
> reasons that I don't think are really fixable.
>

My reason for unblocking only SIGUSR1, rather than unblocking all
signals, was mainly that we are already in the error path and are
about to exit after emitting the error report. I was not sure whether
we intended to receive any other signal just before exiting.
The solution Robert and Tom are suggesting, calling
BackgroundWorkerUnblockSignals, fixes the actual problem.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
