Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)
Date: 2015-01-30 23:52:03
Message-ID: 54CC1923.7040204@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 01/15/2015 03:03 AM, Andres Freund wrote:
> 0002: Use a nonblocking socket for FE/BE communication and block using
> latches.

s/suceeds/succeeds/

> 0004: Process 'die' interrupts while reading/writing from the client socket.

> + * Check for interrupts here, in addition to secure_write(),
> + * because a interrupted write in secure_raw_write() can possibly
> + * only return to here, not to secure_write().
> */
> + ProcessClientWriteInterrupt(true);

Took me a while to parse that sentence. How about:

Check for interrupts here, in addition to secure_write(), because an
interrupted write in secure_raw_write() will return here, and we cannot
return to secure_write() until we've written something.

> 0005: Move deadlock and other interrupt handling in proc.c out of signal handlers.
>
> I'm surprised how comparatively simple this turned out to be. For
> robustness and understanding I think this is a great improvement.
>
> New and not reviewed at all. Needs significant review. But works
> in my simple testcases.

> @@ -799,6 +791,7 @@ ProcKill(int code, Datum arg)
> proc = MyProc;
> MyProc = NULL;
> DisownLatch(&proc->procLatch);
> + SetLatch(MyLatch);
>
> SpinLockAcquire(ProcStructLock);
>
> @@ -868,6 +861,7 @@ AuxiliaryProcKill(int code, Datum arg)
> proc = MyProc;
> MyProc = NULL;
> DisownLatch(&proc->procLatch);
> + SetLatch(MyLatch);
>
> SpinLockAcquire(ProcStructLock);

These SetLatch calls are unnecessary. SwitchBackToLocalLatch() already
sets the latch. (According to the comment in SwitchToSharedLatch() even
that should not be necessary.)

> 0006: Don't allow immediate interupts during authentication anymore.

In commit message: s/interupts/interrupts/.

> @@ -80,9 +73,6 @@ md5_crypt_verify(const Port *port, const char *role, char *client_pass,
> if (*shadow_pass == '\0')
> return STATUS_ERROR; /* empty password */
>
> - /* Re-enable immediate response to SIGTERM/SIGINT/timeout interrupts */
> - ImmediateInterruptOK = true;
> - /* And don't forget to detect one that already arrived */
> CHECK_FOR_INTERRUPTS();
>
> /*

That lone CHECK_FOR_INTERRUPTS() that remains looks pretty randomly
placed now that the "ImmediateInterruptOK = true" is gone. I would just
remove it. Add one to the caller if it's needed, but it probably isn't.

> 0007: Remove the option to service interrupts during PGSemaphoreLock().

s/don't/doesn't/ in commit message.

> Questions I have about the series right now:
>
> 1) Do others agree that getting rid of ImmediateInterruptOK is
> worthwile. I personally think that we really should pursue that
> aggressively as a goal.

Yes.

> 2) Does anybody see a significant problem with the reduction of cases
> where we immediately react to authentication_timeout? Since we still
> react immediately when we're waiting for the client (except maybe in
> sspi/kerberos?) I think that's ok. The waiting cases are slow
> ident/radius/kerberos/ldap/... servers. But we really shouldn't jump
> out of the relevant code anyway.

Yeah, I think that's OK. Doing those things with
ImmediateInterruptOK=true was quite scary, and we need to stop doing
that. It would be nice to have a wholesale solution for those lookups
but I don't see one. So all the ident/radius/kerberos/ldap lookups will
need to be done in a non-blocking way to have them react to the timeout.
That requires some work, but we can leave that to a followup patch, if
someone wants to write it.

Overall, 1-8 look good to me, except for that one incomplete comment in
ProcessClientWriteInterrupt() I mentioned in a separate email. I haven't
reviewed patches 9 and 10 yet.

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Sawada Masahiko 2015-01-31 05:00:45 Re: Proposal: knowing detail of config files via SQL
Previous Message Peter Geoghegan 2015-01-30 23:49:12 Re: PageRepairFragmentation performance