Re: BUG #16827: macOS interrupted syscall leads to a crash

From: Ricardo Ungureanu <ricardoungureanu(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16827: macOS interrupted syscall leads to a crash
Date: 2021-01-16 17:30:52
Message-ID: CAFHbkwjPXHgWb15wsmaCMuYOOF2Fm6sBVj9Sv1LbwRav75KsRg@mail.gmail.com
Lists: pgsql-bugs

Let me give you more context. I am resending this because I forgot to
add the mailing list.

On Fri, 15 Jan 2021 at 23:05, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2021-01-15 14:00:03 +0000, PG Bug reporting form wrote:
> > I am using macOS 11.0 and trying to import a large dump into postgresql.
> > Under some circumstances, it crashes while importing.
> > I inspected the logs and found out a system call is interrupted (" LOG:
> > could not open file "pg_wal": Interrupted system call"). Apple has added a
> > new feature in macOS 11.0 to audit security events. I noticed that the
> > kernel, while waiting on a condition variable, if it receives an interrupt,
> > will just pass EINTR (error code 4) back to the usermode program. Your
> > function XLogFileInit does not treat such cases (just ENOENT is checked) and
> > decides to exit with an abort(). I have attached below the crash file
> > generated.
>
> Hm. It's fairly nasty to return EINTR from open() (except if open()ing a
> FIFO or such) - it should normally only happen when blocked. But I'm not
> sure it's *actually* violating any standards / promises made.

Apple supports two kinds of security events: AUTH and NOTIFY. With an
AUTH event the system call is blocked (on the condition variable I
mentioned) while a user-mode daemon is asked about the generated
event, something like: "postgres, pid 999, open() on file /path/file
with flags 0x400003". The daemon replies to either allow or deny the
event. If it denies the system call, the target process (in this case,
postgres) sees it return -1 with "operation not permitted" (errno 1).
NOTIFY events, on the other hand, are only logged; no verdict (allow
or deny) is requested.
In my scenario, the user-mode daemon responsible for auditing these
events is set to ALLOW everything. If it denied the system call, I
would see "operation not permitted (err 1)" in the postgres log
instead of "Interrupted system call (err 4)".
To sum up: open() is blocked waiting for a verdict from user mode;
meanwhile an interrupt arrives, msleep on the condition variable
returns EINTR, and this is passed back to postgres as the error from
open().
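
To illustrate what handling this case could look like on the caller
side, here is a minimal sketch that retries open() while it fails with
EINTR. The helper name open_retry_eintr and the max_retries bound are
assumptions made up for this example; this is not how XLogFileInit is
written today and not a proposed patch.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/*
 * open_retry_eintr is a hypothetical helper, not an existing PostgreSQL
 * function. It retries open() while it fails with EINTR, allowing up to
 * max_retries extra attempts (an arbitrary bound chosen for this sketch).
 */
static int
open_retry_eintr(const char *path, int flags, mode_t mode, int max_retries)
{
    int retries = 0;

    for (;;)
    {
        int fd = open(path, flags, mode);

        /* Success, or a failure other than EINTR: hand it to the caller. */
        if (fd >= 0 || errno != EINTR)
            return fd;

        /* Interrupted: give up once the retry budget is spent. */
        if (retries++ >= max_retries)
            return -1;          /* errno is still EINTR here */
    }
}
```

Any error other than EINTR (ENOENT, EPERM from a denied AUTH event, and
so on) is returned unchanged, so existing error checks in the caller
would keep working.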

> > Apple has added a new feature in macOS 11.0 to audit security
> > events. I noticed that the kernel, while waiting on a condition
> > variable, if it receives an interrupt, will just pass EINTR (error
> > code 4) back to the usermode program.
>
> Does that also happen for close()? Because that can't reasonably be
> handled by userspace (userspace cannot retry because the fd could now
> point to something else in a threaded environment).

Good point. However, the close event is only supported as NOTIFY, not
as AUTH, so this cannot happen on close().
open() and other file system calls are available as both AUTH and
NOTIFY events (you can choose which one to enable).
You can read more about this here [1].

> Greetings,
>
> Andres Freund

Regards,
Ricardo Ungureanu

[1] https://developer.apple.com/documentation/endpointsecurity/es_event_type_t
