Re: retry shm attach for windows (WAS: Re: OK, so culicidae is *still* broken)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: retry shm attach for windows (WAS: Re: OK, so culicidae is *still* broken)
Date: 2017-07-10 14:08:53
Message-ID: 14979.1499695733@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Noah Misch <noah(at)leadboat(dot)com> writes:
> On Mon, Jun 05, 2017 at 09:56:33AM -0400, Tom Lane wrote:
>> Yeah, being able to reproduce the problem reliably enough to say whether
>> it's fixed or not is definitely the sticking point here. I have some
>> ideas about that: ...

> I tried this procedure without finding a single failure.

Thanks for testing! But apparently we still lack some critical part
of the recipe for getting failures in the wild.

> I watched the ensuing memory maps, which led me to these interesting facts:

> An important result of the ASLR design in Windows Vista is that some address
> space layout parameters, such as PEB, stack, and heap locations, are
> selected once per program execution. Other parameters, such as the location
> of the program code, data segment, BSS segment, and libraries, change only
> between reboots.
> ...
> This offset is selected once per reboot, although we’ve uncovered at least
> one other way to cause this offset to be reset without a reboot (see
> Appendix II).
> -- http://www.symantec.com/avcenter/reference/Address_Space_Layout_Randomization.pdf

That is really interesting, though I'm not quite sure what to make of it.
It suggests though that you might need certain types of antivirus products
to be active, to create more variability in the initial process address
space layout than would happen from Windows ASLR alone.

> I recommend pushing your patch so the August back-branch releases have it.
> One can see by inspection that your patch has negligible effect on systems
> healthy today. I have a reasonable suspicion it will help some systems. If
> others remain broken after this, that fact will provide a useful clue.

Okay, so that leaves us with a decision to make: push it into beta2, or
wait till after wrap? I find it pretty scary to push a patch with
portability implications so soon before wrap, but a quick look at the
buildfarm suggests we'd get runs from 3 or 4 Windows members before the
wrap, if I do it quickly. If we wait, then it will hit the field in
production releases with reasonable buildfarm testing but little more.
That's also pretty scary.

On balance I'm tempted to push it now for beta2, but it's a close call.
Thoughts?

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2017-07-10 14:19:49 Re: retry shm attach for windows (WAS: Re: OK, so culicidae is *still* broken)
Previous Message Marina Polyakova 2017-07-10 13:21:28 Re: WIP Patch: Pgbench Serialization and deadlock errors