Re: EINTR in ftruncate()

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: EINTR in ftruncate()
Date: 2022-07-07 05:58:10
Message-ID: CA+hUKGJSm-nq8s+_59zb7NbFQF-OS=xTnTAiGLrQpuSmU2y_1A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 7, 2022 at 9:05 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> On Thu, Jul 7, 2022 at 9:03 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > On 2022-07-07 08:56:33 +1200, Thomas Munro wrote:
> > > On Thu, Jul 7, 2022 at 8:39 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > > So I think we need: 1) block most signals, 2) a retry loop *without*
> > > > interrupt checks.

Here's a draft patch that tries to explain all this in the commit
message and comments.

Even if we go with this approach now, I think it's plausible that we
might want to reconsider this yet again one day, perhaps allocating
memory with some future asynchronous infrastructure while still
processing interrupts. It's not very nice to hold up recovery or
ProcSignalBarrier for long operations.

I'm a little unclear about ftruncate() here. I don't expect it to
report EINTR in other places where we use it (ie to make a local file
on a non-"slow device" smaller), because I expect that to be like
read(), write() etc which we don't wrap in EINTR loops. Here you've
observed EINTR when messing around with a debugger*. It seems
inconsistent to put posix_fallocate() in an EINTR retry loop for the
benefit of debuggers, but not ftruncate(). But perhaps that's good
enough, on the theory that posix_fallocate(1GB) is a very large target
and you have a decent chance of hitting it.

Another observation while staring at that ftruncate(): It's entirely
redundant on Linux, because we only ever call dsm_impl_posix_resize()
to make the segment bigger. Before commit 3c60d0fa (v12) it was
possible to resize a segment to be smaller with dsm_resize(), so you
needed one or t'other depending on the requested size and we just
called both, but dsm_resize() wasn't ever used AFAIK and didn't even
work on all DSM implementations, among other problems, so we ripped it
out. So... on master at least, we could also change the #ifdef to be
either-or. While refactoring like that, I think we might as well also
rearrange the code so that the wait event is reported also for other
OSes, just in case it takes a long time. See 0002 patch.

*It's funny that ftruncate() apparently doesn't automatically restart
for ptrace SIGCONT on Linux according to your report, while poll()
does according to my experiments, even though the latter is one of the
never-restart functions (it doesn't on other OSes I hack on, and you
feel the difference when debugging missing wakeup type bugs...).
Random implementation details...

Attachment Content-Type Size
0001-Block-signals-while-allocating-DSM-memory.patch text/x-patch 4.2 KB
0002-Remove-redundant-ftruncate-when-allocating-memory.patch text/x-patch 3.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2022-07-07 06:22:55 Re: pg_upgrade allows itself to be run twice
Previous Message Pavel Stehule 2022-07-07 05:49:47 Re: Schema variables - new implementation for Postgres 15