Re: Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)
Date: 2015-01-31 11:11:18
Message-ID: 20150131111118.GM24213@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2015-01-31 00:52:03 +0100, Heikki Linnakangas wrote:
> On 01/15/2015 03:03 AM, Andres Freund wrote:
> >0004: Process 'die' interrupts while reading/writing from the client socket.
>
> >+ * Check for interrupts here, in addition to secure_write(),
> >+ * because a interrupted write in secure_raw_write() can possibly
> >+ * only return to here, not to secure_write().
> > */
> >+ ProcessClientWriteInterrupt(true);
>
> Took me a while to parse that sentence. How about:
>
> Check for interrupts here, in addition to secure_write(), because an
> interrupted write in secure_raw_write() will return here, and we cannot
> return to secure_write() until we've written something.

Yep, that's better.

> >0005: Move deadlock and other interrupt handling in proc.c out of signal handlers.
> >
> > I'm surprised how comparatively simple this turned out to be. For
> > robustness and understanding I think this is a great improvement.
> >
> > New and not reviewed at all. Needs significant review. But works
> > in my simple testcases.
>
> >@@ -799,6 +791,7 @@ ProcKill(int code, Datum arg)
> > proc = MyProc;
> > MyProc = NULL;
> > DisownLatch(&proc->procLatch);
> >+ SetLatch(MyLatch);
> >
> > SpinLockAcquire(ProcStructLock);
> >
> >@@ -868,6 +861,7 @@ AuxiliaryProcKill(int code, Datum arg)
> > proc = MyProc;
> > MyProc = NULL;
> > DisownLatch(&proc->procLatch);
> >+ SetLatch(MyLatch);
> >
> > SpinLockAcquire(ProcStructLock);
>
> These SetLatch calls are unnecessary. SwitchBackToLocalLatch() already sets
> the latch. (According to the comment in SwitchToSharedLatch() even that
> should not be necessary.)

Ick, that's some debugging leftovers where I was trying to find a bug
that was fundamentally caused by the missing barriers in the latch
code...

> >0006: Don't allow immediate interupts during authentication anymore.
>
> In commit message: s/interupts/interrupts/.
>
> >@@ -80,9 +73,6 @@ md5_crypt_verify(const Port *port, const char *role, char *client_pass,
> > if (*shadow_pass == '\0')
> > return STATUS_ERROR; /* empty password */
> >
> >- /* Re-enable immediate response to SIGTERM/SIGINT/timeout interrupts */
> >- ImmediateInterruptOK = true;
> >- /* And don't forget to detect one that already arrived */
> > CHECK_FOR_INTERRUPTS();
> >
> > /*
>
> That lone CHECK_FOR_INTERRUPTS() that remains looks pretty randomly placed
> now that the "ImmediateInterruptOK = true" is gone. I would just remove it.
> Add one to the caller if it's needed, but it probably isn't.

There's some method to the madness here actually - we just did some
database stuff that could in theory take a while... Whether it's
worthwhile to leave it here I'm not sure.

> >2) Does anybody see a significant problem with the reduction of cases
> > where we immediately react to authentication_timeout? Since we still
> > react immediately when we're waiting for the client (except maybe in
> > sspi/kerberos?) I think that's ok. The waiting cases are slow
> > ident/radius/kerberos/ldap/... servers. But we really shouldn't jump
> > out of the relevant code anyway.
>
> Yeah, I think that's OK. Doing those things with ImmediateInterruptOK=true
> was quite scary, and we need to stop doing that. It would be nice to have a
> wholesale solution for those lookups but I don't see one. So all the
> ident/radius/kerberos/ldap lookups will need to be done in a non-blocking
> way to have them react to the timeout. That requires some work, but we can
> leave that to a followup patch, if someone wants to write it.

Ok, cool.

> Overall, 1-8 look good to me, except for that one incomplete comment in
> ProcessClientWriteInterrupt() I mentioned in a separate email.

Thanks for the review! I plan to push the fixed up versions sometime
next week.

> I haven't reviewed patches 9 and 10 yet.

Yea, those aren't as close to being ready (and thus marked WIP).

I'd like to do the lwlock stuff for 9.5 because then any sane setup
(i.e. unless spinlocks are emulated) doesn't need any semaphores
anymore, which makes setup easier. Right now users need to adjust system
settings when changing max_connctions upwards or run multiple
clusters. But it's not really related to getting rid of
ImmediateInterruptOK.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2015-01-31 11:36:37 Re: Safe memory allocation functions
Previous Message Marc Mamin 2015-01-31 11:03:15 Re: tablespaces inside $PGDATA considered harmful