Re: EINTR in ftruncate()

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: EINTR in ftruncate()
Date: 2022-07-01 20:29:44
Message-ID: 20220701202944.xkuroe3kdc5izykd@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2022-07-01 19:55:16 +0200, Alvaro Herrera wrote:
> On 2022-Jul-01, Andres Freund wrote:
>
> > On 2022-07-01 17:41:05 +0200, Alvaro Herrera wrote:
> > > Nicola Contu reported two years ago to pgsql-general[1] that they were
> > > having sporadic query failures, because EINTR is reported on some system
> > > call. I have been told that the problem persists, though it is very
> > > infrequent. I propose the attached patch. Kyotaro proposed a slightly
> > > different patch which also protects write(), but I think that's not
> > > necessary.
> >
> > What is the reason for the || ProcDiePending || QueryCancelPending bit? What
> > if there's dsm operations intentionally done while QueryCancelPending?
>
> That mirrors the test for the other block in that function, which was
> added by 63efab4ca139, whose commit message explains:
>
> Allow DSM allocation to be interrupted.
>
> Chris Travers reported that the startup process can repeatedly try to
> cancel a backend that is in a posix_fallocate()/EINTR loop and cause it
> to loop forever. Teach the retry loop to give up if an interrupt is
> pending. Don't actually check for interrupts in that loop though,
> because a non-local exit would skip some clean-up code in the caller.

That whole approach seems quite wrong to me. At the absolute very least the
code needs to check if interrupts are being processed in the current context
before just giving up due to ProcDiePending || QueryCancelPending.

I'm very unconvinced this ought to be fixed in dsm_impl_posix_resize(), rather
than the startup process signalling.

There is an argument for allowing more things to be cancelled, but we'd need a
retry loop for the !INTERRUPTS_CAN_BE_PROCESSED() case.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-07-01 20:40:45 Re: Making Vars outer-join aware
Previous Message Finnerty, Jim 2022-07-01 20:19:48 Re: Making Vars outer-join aware