Re: DSM robustness failure (was Re: Peripatus/failures)

From: Larry Rosenman <ler(at)lerctr(dot)org>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: DSM robustness failure (was Re: Peripatus/failures)
Date: 2018-10-18 05:02:31
Message-ID: 20181018050231.d4xt3or5wg2g2npo@ler-imac.local
Lists: pgsql-hackers

On Wed, Oct 17, 2018 at 08:19:52PM -0500, Larry Rosenman wrote:
> On Thu, Oct 18, 2018 at 02:17:14PM +1300, Thomas Munro wrote:
> > On Thu, Oct 18, 2018 at 1:10 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > > ... However, I'm still slightly interested in how it
> > > was that that broke DSM so thoroughly ...
> >
> > Me too. Frustratingly, that vm object might still exist on Larry's
> > machine if it hasn't been rebooted (since we failed to shm_unlink()
> > it), so if we knew its name we could write a program to shm_open(),
> > mmap(), dump out to a file for analysis and then we could work out
> > which of the sanity tests it failed and maybe get some clues.
> > Unfortunately it's not in any of our logs AFAIK, and I can't see any
> > way to get a list of existing shm_open() objects from the kernel.
> > From sys/kern/uipc_shm.c:
> >
> > * TODO:
> > *
> > * (1) Need to export data to a userland tool via a sysctl. Should ipcs(1)
> > * and ipcrm(1) be expanded or should new tools to manage both POSIX
> > * kernel semaphores and POSIX shared memory be written?
> >
> > Gah. So basically that's hiding in shm_dictionary in the kernel and I
> > don't know a way to look at it from userspace (other than trying to
> > open all 2^32 random paths we're capable of generating).
>
> It has *NOT* been rebooted. I can give y'all id's if you want to go
> poking around.
Let me know soon(ish) if any of you want to poke at this machine, as I'm
likely to forget and reboot it.
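
For reference, the shm_open() / mmap() / dump-to-file program Thomas
describes above could look roughly like the sketch below. It is only an
illustration, not anything from the PostgreSQL tree: the function name
dump_shm_object() and both of its parameters are made up here, and the
segment name still has to be known in advance, which is exactly the
missing piece in this thread.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy the contents of an existing POSIX shared
 * memory object into a regular file so it can be inspected offline.
 * Returns 0 on success and -1 on any failure.
 */
int
dump_shm_object(const char *shm_name, const char *out_path)
{
    struct stat st;
    void       *addr;
    FILE       *out;
    int         fd;
    int         rc = -1;

    /* Open read-only; without O_CREAT this fails if the object is gone. */
    fd = shm_open(shm_name, O_RDONLY, 0);
    if (fd < 0)
        return -1;

    /* The object's size tells us how much to map and dump. */
    if (fstat(fd, &st) < 0 || st.st_size <= 0)
    {
        close(fd);
        return -1;
    }

    addr = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping outlives the descriptor */
    if (addr == MAP_FAILED)
        return -1;

    out = fopen(out_path, "wb");
    if (out != NULL)
    {
        if (fwrite(addr, 1, (size_t) st.st_size, out) == (size_t) st.st_size)
            rc = 0;
        if (fclose(out) != 0)
            rc = -1;
    }

    munmap(addr, (size_t) st.st_size);
    return rc;
}
```

A caller would pass the segment name and an output path, for example
dump_shm_object("/some-segment-name", "segment.bin"), where both strings
are placeholders. The object is opened read-only and never unlinked, so
the evidence stays in place for further inspection.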

>
>
> >
> > --
> > Thomas Munro
> > http://www.enterprisedb.com
>
> --
> Larry Rosenman http://www.lerctr.org/~ler
> Phone: +1 214-642-9640 E-Mail: ler(at)lerctr(dot)org
> US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: ler(at)lerctr(dot)org
US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106
