Re: "could not reattach to shared memory" on buildfarm member dory

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: "could not reattach to shared memory" on buildfarm member dory
Date: 2018-04-30 17:07:49
Message-ID: 17897.1525108069@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

[ Thanks to Stephen for cranking up a continuous build loop on dory ]

Noah Misch <noah(at)leadboat(dot)com> writes:
> On Tue, Apr 24, 2018 at 11:37:33AM +1200, Thomas Munro wrote:
>> Maybe try asking what's mapped there with VirtualQueryEx() on failure?

> +1. An implementation of that:
> https://www.postgresql.org/message-id/20170403065106.GA2624300%40tornado.leadboat.com

So I tried putting in that code, and it turns the problem from something
that maybe happens in every third buildfarm run or so, to something that
happens at least a dozen times in a single "make check" step. This seems
to mean that either EnumProcessModules or GetModuleFileNameEx is itself
allocating memory, and sometimes that allocation comes out of the space
VirtualFree just freed :-(.

So we can't use those functions. We have however proven that no new
module gets loaded during VirtualFree or MapViewOfFileEx, so there
doesn't seem to be anything more to be learned from them anyway.

What it looks like to me is that MapViewOfFileEx allocates some memory and
sometimes that comes out of the wrong place. This is, um, unfortunate.
It also appears that VirtualFree might sometimes allocate some memory,
and that'd be even more unfortunate, but it's hard to be certain; the
blame might well fail on VirtualQuery instead. (Ain't Heisenbugs fun?)

The solution I was thinking about last night was to have
PGSharedMemoryReAttach call MapViewOfFileEx to map the shared memory
segment at an unspecified address, then unmap it, then call VirtualFree,
and finally call MapViewOfFileEx with the real target address. The idea
here is to get these various DLLs to set up any memory allocation pools
they're going to set up before we risk doing VirtualFree. I am not,
at this point, convinced this will fix it :-( ... but I'm not sure what
else to try.

In any case, it's still pretty unclear why dory is showing this problem
and other buildfarm members are not. whelk for instance seems to be
loading all the same DLLs and more besides.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2018-04-30 17:18:52 BufFileSize() doesn't work on a "shared" BufFiles
Previous Message Andres Freund 2018-04-30 16:09:45 Re: Postgres, fsync, and OSs (specifically linux)