Re: retry shm attach for windows (WAS: Re: OK, so culicidae is *still* broken)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: retry shm attach for windows (WAS: Re: OK, so culicidae is *still* broken)
Date: 2017-06-06 02:10:34
Message-ID: CAA4eK1JcO=vhwTrRk7GO1F9b+U-BBkCO=7cKD6SHjQnRXWkUAA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 5, 2017 at 7:26 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
>> Sure. I think it is slightly tricky because specs don't say clearly
>> how ASLR can impact the behavior of any API and in my last attempt I
>> could not reproduce the issue.
>
>> I can try to do basic verification with the patch you have proposed,
>> but I fear, to do the actual tests as suggested by you is difficult as
>> it is not reproducible on my machine, but I can still try.
>
> Yeah, being able to reproduce the problem reliably enough to say whether
> it's fixed or not is definitely the sticking point here.

Agreed. By the way, while reading up on this problem, I found that
another open source project (nginx) has used a solution similar to what
Andres was proposing upthread to solve this problem. Refer:
https://github.com/nginx/nginx/commit/2ec8cfcd7d34415a99c3f3db3024ae954c00d0dd

Just to be clear, I don't mean that we should adopt a solution merely
because another open source project uses it, but I think it is worth
considering, especially as it looks like a proven solution: it was
committed in 2015 and has stayed in the code since then.

> I have some
> ideas about that:
>
> 1. Be sure to use Win32 not Win64 --- the odds of a failure in the larger
> address space are so small you'd never prove anything. (And of course
> it has to be a version that has ASLR enabled.)
>

I don't have access to a Win32 machine, so I request anyone reading
this who does have access to one to help in reproducing the problem.

> 2. Revert 7f3e17b48 so that you have an ASLR-enabled build.
>
> 3. Crank shared_buffers to the maximum the machine will allow, reducing
> the amount of free address space and improving the odds of a collision.
>
> 4. Spawn lots of sessions --- pgbench with -C option might be a useful
> testing tool.
>

All are very good points and I think together they will certainly
improve the chances of reproducing this problem.
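
For anyone who wants to try step 4, a rough sketch of how the pgbench
run could be scripted is below. It only builds the command line; the
client count, duration, and database name are placeholders I picked
for illustration and are not values suggested anywhere in this thread.
It also assumes the database has already been initialized with
"pgbench -i" and that shared_buffers was raised separately as per
step 3.

```python
import shlex


def pgbench_reconnect_cmd(dbname, clients=50, seconds=300):
    """Build a pgbench command line that reconnects for every transaction.

    The defaults are illustrative placeholders, not tuned values.
    """
    return [
        "pgbench",
        "-C",                # establish a new connection for each transaction
        "-c", str(clients),  # number of concurrent client sessions
        "-T", str(seconds),  # run for this many seconds
        dbname,
    ]


if __name__ == "__main__":
    # Print the command so it can be copied into a shell or a batch file.
    print(shlex.join(pgbench_reconnect_cmd("postgres")))
```

Since -C makes every transaction start a fresh backend, each
transaction is one more attempt to attach the shared memory segment,
which is the operation that fails intermittently.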

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
