windows 8 RTM compatibility issue (could not reserve shared memory region for child)

From: Dave Vitek <dvitek(at)grammatech(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Cc: v-seishi(at)microsoft(dot)com
Subject: windows 8 RTM compatibility issue (could not reserve shared memory region for child)
Date: 2012-09-05 03:45:47
Message-ID: 5046CAEB.4010600@grammatech.com
Lists: pgsql-bugs

Hello pgsql-bugs list,

I have attached a patch file that I believe resolves a compatibility
issue between Windows 8 RTM and PostgreSQL. The impatient might want to
just read the patch; this email is longer than it probably should be. I
have CC'd Seiko Ishida, who expressed an interest in Windows 8
compatibility on this list about a year ago.

We test postgres pretty heavily at my place of work (probably thousands
of DBs created and exercised each day) on a number of platforms. We've
been doing compatibility testing with the Windows 8 previews and
everything has been working well. We are using the latest postgres release.

However, last week we upgraded from a preview version to the RTM version
of Windows 8 x64, and it is clear that something changed. Since
upgrading, we have been getting this error message a few times a day.
Still very rare, but it never happened before the upgrade.

LOG: could not reserve shared memory region (addr=0000000001410000) for child 0000000000000F8C: 487
LOG: could not fork new process for connection: A blocking operation was interrupted by a call to WSACancelBlockingCall.

The first message corresponds to VirtualAllocEx failing with error 487
(ERROR_INVALID_ADDRESS) inside win32_shmem.c (search for the error
message).

Postgres uses a shared memory block to do much of its IPC. This shared
memory block presumably stores pointers to itself, and so must be
allocated at the same address inside every postgres process. In order
to maximize the probability that this address will be available in child
processes, the address should be reserved as early as possible in the
lifetime of the child process (before the address space gets polluted).
In order to achieve this goal, the postmaster starts its children in a
suspended state and reserves the address before any code has executed in
the child process.

However, there are a bunch of chunks of the virtual address space
already reserved even when the child process is in this suspended
state. At least some of them are memory mapped images of binaries
(duh). I believe VirtualAllocEx is failing because something is already
mapped (in the child) to the address the postmaster wants the shared
memory segment to live at.

I wrote a small program that repeatedly starts postgres.exe in suspended
mode and then tries to VirtualAllocEx 0x1410000. The address is never
blocked on Windows 7, but is blocked 2% of the time on Windows 8. I
attached windbg to the troublesome postgres process and used "!vadump
-v" to see that there is a file mapped to the contentious address while
postgres is in the suspended state. I don't know if the failure rate is
this bad for all addresses or just this one, but the possibility of
conflict exists, since the postmaster was willing to use this address in
at least one run.

So why hasn't this ever happened before? I'm guessing that ASLR got
better in the latest Windows 8 build, or maybe there's just more stuff
in the virtual address space of a newborn process.

The postmaster originally decides where to place the shared memory
segment by letting Windows (MapViewOfFileEx) choose where to put it. So
if the postmaster ends up using address 0x1410000, and then the
postgres.exe image (for example) gets mapped to that same address in the
child, you'll end up with the error message above.

I assume Windows changed so that the addresses in use inside a newborn
process can now conflict with the addresses returned by
MapViewOfFileEx(..., NULL). These two sets must have been disjoint in
previous versions of Windows, and postgres was relying on that behavior.

One straightforward "fix" is to specify a hardcoded address to
MapViewOfFileEx instead of NULL. This address should be carefully
selected such that it is in an area disjoint from the portions of the
address space that are potentially reserved in a newborn process, and
also unlikely to be in use inside the postmaster when it first maps the
shared memory. This is pretty trivial to do for a particular
version/configuration of Windows. However, I see no future-proof
solution (besides making the shared segment position independent). If
the hardcoded address is not available, you can always fall back on the
current behavior.
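
The "try the hardcoded address, then fall back" idea can be sketched
roughly as follows. This is only an illustration, not the code in the
attached patch. The map_fn callback is a hypothetical stand-in for a
wrapper around MapViewOfFileEx, and toy_map is a made-up mapper that
touches no real memory; it exists only so the control flow can be
exercised on any platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapper signature: returns the mapped address, or NULL on
 * failure. A NULL `preferred` means "let the system choose". On Windows
 * this role would be played by a wrapper around MapViewOfFileEx. */
typedef void *(*map_fn)(void *preferred, size_t size, void *ctx);

/* Try the hardcoded address first; if it is unavailable, fall back to a
 * system-chosen address (the current behavior). */
static void *map_with_fallback(map_fn map, void *ctx,
                               void *preferred, size_t size)
{
    void *addr = NULL;

    assert(map != NULL);
    if (preferred != NULL)
        addr = map(preferred, size, ctx);
    if (addr == NULL)
        addr = map(NULL, size, ctx);
    return addr;
}

/* Toy mapper, for illustration only: one address range counts as
 * occupied, and a NULL request always yields `system_choice`. */
struct toy_space { uintptr_t busy_lo, busy_hi, system_choice; };

static void *toy_map(void *preferred, size_t size, void *ctx)
{
    const struct toy_space *s = ctx;
    uintptr_t lo = (uintptr_t) preferred;

    if (preferred == NULL)
        return (void *) s->system_choice;
    if (lo < s->busy_hi && lo + size > s->busy_lo)
        return NULL;                /* overlaps the occupied range */
    return preferred;
}
```

With the toy mapper, a request at an occupied address returns the
system-chosen address instead, which is the same outcome postgres gets
today.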

On 64-bit versions of Windows, processes that do not use more than 4G or
so of address space seem to always have a huge hole from about
0x0000000080000000 to 0x0000070000000000. Note that you cannot reserve
addresses above 8TB, so the segment would need to go somewhere in this
hole; above 4G is probably preferable.

32-bit Windows 8 also exists. We haven't been testing on it, and so I
can't confirm that the problem exists there. Assuming it does, 32-bit
processes are likely to be trickier since address space is more scarce.
In practice, it appears that there is usually a big hole from 0x10000000
to 0x70000000.
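
To make the two holes concrete, here is a small sketch of a check that a
candidate segment fits entirely inside one of them. The bounds are just
the empirical figures quoted above, not anything Windows guarantees, and
the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Observed holes from the message above; empirical, not guaranteed. */
#define HOLE64_LO UINT64_C(0x0000000080000000)
#define HOLE64_HI UINT64_C(0x0000070000000000)
#define HOLE32_LO UINT64_C(0x10000000)
#define HOLE32_HI UINT64_C(0x70000000)

/* True if [base, base + size) lies entirely inside [lo, hi). */
static bool range_in_hole(uint64_t base, uint64_t size,
                          uint64_t lo, uint64_t hi)
{
    assert(lo < hi);
    if (size == 0 || base < lo || base >= hi)
        return false;
    return size <= hi - base;   /* avoids overflow in base + size */
}
```

For example, a 1G segment based at 4G passes the 64-bit check, while the
address from the log (0x1410000) does not, since it sits below the start
of the hole.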

There is a security problem with the fix I outline above. It bypasses
ASLR to a limited degree, since the shared memory would likely end up
always living at the same address. I am not certain that MapViewOfFile
even tries to be unpredictable, but let's assume it does or will be someday.

This security problem can be addressed by adding a random number to the
hardcoded address. Interfacing with a suitable entropy source/PRNG
might prove to be a PITA, but there is a way of avoiding that. We can
invoke MapViewOfFile once with NULL in order to get a "random address"
and then sum the least significant bits of that with our hardcoded base
address to get the preferred address for the shared segment. This way
we end up with an address that is no less secure than the one currently
returned by MapViewOfFile, insofar as MapViewOfFile doesn't select high
addresses.
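
The address arithmetic described above might look roughly like this.
It is a sketch under assumptions, not the code from the attached patch:
the function name, the mask, and the granularity parameter are all
hypothetical. The alignment step assumes the base address passed to
MapViewOfFileEx must be a multiple of the allocation granularity
(typically 64 KB on Windows).

```c
#include <assert.h>
#include <stdint.h>

/* Combine a hardcoded base with the low bits of a system-chosen address.
 * `mask` selects the low bits to keep and must have the form 2^k - 1.
 * `granularity` must be a power of two; the offset is rounded down to a
 * multiple of it so that an aligned base yields an aligned result. */
static uint64_t preferred_address(uint64_t base, uint64_t random_addr,
                                  uint64_t mask, uint64_t granularity)
{
    uint64_t offset;

    assert((mask & (mask + 1)) == 0);
    assert(granularity != 0 && (granularity & (granularity - 1)) == 0);

    offset = random_addr & mask;        /* least significant bits only */
    offset &= ~(granularity - 1);       /* round down to granularity */
    return base + offset;
}
```

With a base of 4G, a 28-bit mask, and a 64 KB granularity, the address
from the log (0x1410000) would become 0x101410000. The mask bounds how
far the result can drift above the base, so the base and mask together
need to keep the whole segment inside the hole.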

I've attached a patch that implements the stuff above. I can share the
code for the program that tests whether an address is reliably available
in a newborn postgres process, if anyone is interested.

- Dave Vitek

Attachment: shmem.patch (text/plain, 2.7 KB)
