Re: Crash with old Windows on new CPU

From: Christian Ullrich <chris(at)chrullrich(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Crash with old Windows on new CPU
Date: 2016-02-13 15:45:33
Message-ID: AM2PR06MB069042775AAB1D290D95F957D4AA0@AM2PR06MB0690.eurprd06.prod.outlook.com
Lists: pgsql-hackers

On February 13, 2016 4:10:34 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Christian Ullrich <chris(at)chrullrich(dot)net> writes:
>> * Robert Haas wrote:
>>> Thanks for the report and patch. Regrettably I haven't the Windows
>>> knowledge to have any idea whether it's right or wrong, but hopefully
>>> someone who knows Windows will jump in here.
>
>> In commitfest now.
>
> FWIW, I'm a tad suspicious of the notion that it's our job to make this
> case work. How practical is it really to run a Windows release on
> unsupported-by-Microsoft hardware --- aren't dozens of other programs
> going to have the same issue?

Why would the hardware be unsupported? The problem occurs on new CPUs, not old ones, and even the OS (2008) is still in extended support until next year, IIRC.

> I'm also suspicious of the "#if _MSC_VER == 1800" tests, that is,
> the code compiles on *exactly one* MSVC version.

The bug exists only in that compiler version's CRT. Also, that is not the complete version number. There may be different builds somewhere, but they all start with 18.0.

After all, MS is in the business of selling new compilers, not maintaining the old ones.

> Maybe that's actually
> what's needed, but it sure looks fishy. And what connection does the
> build toolchain version have to the runtime environment anyway?

The CRT version is tied to the compiler version. It has mainly to do with matching allocators.

> Likewise, how can we know that !IsWindows7SP1OrGreater() is the exactly
> right runtime test?

Because all sources, including Microsoft, say that AVX2 support was added in 7SP1.

> Lastly, I'd like to see some discussion of what side effects
> "_set_FMA3_enable(0);" has ... I rather doubt that it's really
> a magic-elixir-against-crashes-with-no-downsides.

It tells the math library (part of the CRT; there is no separate libm on Windows) not to use the AVX2-based implementations of log() and possibly other functions. AIUI, FMA means "fused multiply-add" and apparently improves both performance and accuracy in transcendental functions.

I can check the CRT source later today and figure out exactly what it does.

Also, if you look at the link I sent, you will find that a member of the Visual C++ Libraries team at MS is the source for the workaround. They probably know what they are doing, present circumstances excepted.

> That would
> give us some context to estimate the risks of this code executing
> when it's not really needed.

Hence all the conditions. The problem is *certain* to occur under these specific conditions (x64 code on Windows before 7SP1 on a CPU with AVX2 when built with VS2013), and under no others, and the checks flip the switch in exactly that case.
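
A minimal sketch of how these conditions might combine, for readers who have not opened the patch. This is not the patch itself: the helper names fma3_workaround_needed() and apply_fma3_workaround() are hypothetical, and only _MSC_VER, _M_AMD64, IsWindows7SP1OrGreater() and _set_FMA3_enable() are real. The sketch does not test for AVX2, on the assumption that switching the FMA3 code path off is harmless on a CPU that could not have used it anyway.

```c
/* The Windows-only headers are pulled in solely for the affected toolchain. */
#if defined(_MSC_VER) && defined(_M_AMD64) && _MSC_VER == 1800
#include <windows.h>
#include <VersionHelpers.h>     /* IsWindows7SP1OrGreater() */
#include <math.h>               /* _set_FMA3_enable() */
#endif

/*
 * Hypothetical helper: the decision logic on its own, so that it can be
 * exercised on any platform. Returns 1 when the workaround applies.
 */
int
fma3_workaround_needed(int msc_ver, int is_x64, int win7sp1_or_greater)
{
    /* x64 build, VS2013 CRT (18.0x), OS older than Windows 7 SP1 */
    return is_x64 && msc_ver == 1800 && !win7sp1_or_greater;
}

/*
 * Hypothetical startup hook. Returns 1 if the FMA3 code path in the
 * CRT math library was disabled, 0 otherwise. On any other compiler
 * or architecture this compiles to a no-op.
 */
int
apply_fma3_workaround(void)
{
#if defined(_MSC_VER) && defined(_M_AMD64) && _MSC_VER == 1800
    if (fma3_workaround_needed(_MSC_VER, 1, IsWindows7SP1OrGreater() != 0))
    {
        _set_FMA3_enable(0);
        return 1;
    }
#endif
    return 0;
}
```

The compile-time guard keeps the call out of every build that does not link the VS2013 CRT, and the runtime test keeps it from firing on an OS that already handles AVX2 state.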

> Without that, I'd be worried that
> this cure is worse than the disease because it breaks cases that
> weren't broken before.

Isn't that what the buildfarm is (among other things) for?

--
Christian Ullrich
