Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)
Date: 2018-03-17 18:20:26
Message-ID: 8181.1521310826@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
> Not sure, but the backbranches seem to be working fine, and the commit
> that triggers the issue is from December 31. Maybe the issue was there
> but we were lucky not to trip on it before.

Yeah, we were simply not testing that overflow-detection code before.
Undoubtedly it would fail in the back branches too if we tested it.

> Anyway, I can confirm that the fix suggested by Tom does the trick
> (well, at least on Fulmar, which is running icc 14.0.3). I've also
> disassembled exec_stmt_fori both with and without the patch - reading
> assembly in not my strength, but if you're interested it's attached. The
> interesting part seems to be the last ~50 lines or so.

Hm, did you get the "master" and "patched" versions backwards? The
allegedly-patched version does the !reverse case like this:

0x00007f71219457ae <+2200>: mov -0x108(%rbp),%eax
0x00007f71219457b4 <+2206>: test %eax,%eax
0x00007f71219457b6 <+2208>: jl 0x7f71219457cf <exec_stmt_fori+2233>
0x00007f71219457b8 <+2210>: mov -0x108(%rbp),%eax
0x00007f71219457be <+2216>: add -0x110(%rbp),%eax
0x00007f71219457c4 <+2222>: mov %eax,-0x110(%rbp)
0x00007f71219457ca <+2228>: jmpq 0x7f7121945573 <exec_stmt_fori+1629>

so that it's apparently optimized

if ((int32) (loop_value + step_value) < loop_value)
break;

into

if (step_value < 0)
break;

which of course never exits the loop. That's slightly different
(and stupider) than what I'd been hypothesizing, but it's a valid
transformation if you ignore the possibility of integer overflow.

It might be worth studying the icc manual to see if it has an
equivalent of -fwrapv. Although we can and probably should fix
this case by changing the code, I'm worried about what other bugs
may exist only in icc builds. I know Andres would like to get
rid of the need for -fwrapv but I suspect that's not really going
to happen soon.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2018-03-17 18:28:22 Re: MCV lists for highly skewed distributions
Previous Message Andres Freund 2018-03-17 18:04:36 Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)