Re: margay fails assertion in stats/dsa/dsm code

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Marcel Hofstetter <hofstetter(at)jomasoft(dot)ch>
Subject: Re: margay fails assertion in stats/dsa/dsm code
Date: 2022-06-29 04:00:32
Message-ID: CA+hUKGL+DNK5d5ARgFuN4kntZrur68E+DWbcm3iyF_NjTZ4DBQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jun 29, 2022 at 6:04 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> My first thought was that the return value of the call to
> dsm_impl_op() at the end of dsm_attach() is not checked and that maybe
> it was returning NULL, but it seems like whoever wrote
> dsm_impl_posix() was pretty careful to ereport(elevel, ...) in every
> failure path, and elevel is ERROR here, so I don't see any issue. My
> second thought was that maybe control had escaped from dsm_attach()
> due to an error before we got to the step where we actually map the
> segment, but then the dsm_segment * would be returned to the caller.
> Maybe they could retrieve it later using dsm_find_mapping(), but that
> function has no callers in core.

Thanks for looking. Yeah. I also read through that code many times
and drew the same conclusion.

> So I'm kind of stumped too, but did you by any chance check whether
> there are any DSM-related messages in the logs before the assertion
> failure?

Marcel kindly granted me access to his test machine, where the failure
can be reproduced by running make check lots of times. I eventually
figured out that the problem control flow is ... of course ... the one
path that doesn't ereport(), and that's when errno == EEXIST. That is
a path that is intended to handle DSM_OP_CREATE. Here we are handling
DSM_OP_ATTACH, and I have verified that we're passing in just O_RDWR.
EEXIST is a nonsensical error for shm_open() without flags containing
O_CREAT | O_EXCL (according to POSIX and Solaris's man page).

On this OS, shm_open() opens plain files in /tmp (normally a RAM disk,
so kinda like /dev/shm on Linux), that much I can tell with a plain
old "ls" command. We can also read its long lost open source cousin
(which may be completely different for all I know, but I'd doubt it):

https://github.com/illumos/illumos-gate/blob/master/usr/src/lib/libc/port/rt/shm.c
https://github.com/illumos/illumos-gate/blob/master/usr/src/lib/libc/port/rt/pos4obj.c

Erm. It looks like __pos4obj_lock() could possibly return -1 and
leave errno == EEXIST, if it runs out of retries? Then shm_open()
would return -1, and we'd blow up. However, for that to happen, one
of those "SHM_LOCK_TYPE" files would have to linger for 64 sleep
loops, and I'm not sure why that'd happen, or what to do about it. (I
don't immediately grok what that lock file is even for.)

I suppose this could indicate that the machine and/or RAM disk is
overloaded/swapping and one of those open() or unlink() calls is
taking a really long time, and that could be fixed with some system
tuning. I suppose it's also remotely possible that the process is
getting peppered with signals so that funky shell script-style locking
scheme is interrupted and doesn't really wait very long. Or maybe I
guessed wrong and some other closed source path is to blame *shrug*.

As for whether PostgreSQL needs to do anything, perhaps we should
ereport for this unexpected error as a matter of self-preservation, to
avoid the NULL dereference you'd presumably get on a non-cassert build
with the current coding? Maybe just:

- if (errno != EEXIST)
+ if (op == DSM_OP_ATTACH || errno != EEXIST)
ereport(elevel,
(errcode_for_dynamic_shared_memory(),
errmsg("could not open shared
memory segment \"%s\": %m",

margay would probably still fail until that underlying problem is
addressed, but less mysteriously on our side at least.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2022-06-29 04:17:33 Re: pg_upgrade allows itself to be run twice
Previous Message David Rowley 2022-06-29 03:23:27 Can we do something to help stop users mistakenly using force_parallel_mode?