Re: Test suite fails on alpha architecture

From: Martin Pitt <martin(at)piware(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Steve Langasek <vorlon(at)debian(dot)org>, José Luis Rivero (yoswink) <yoswink(at)gentoo(dot)org>, debian-alpha(at)lists(dot)debian(dot)org, pgsql-bugs(at)postgreSQL(dot)org
Subject: Re: Test suite fails on alpha architecture
Date: 2007-12-04 22:43:40
Message-ID: 20071204224340.GH6765@piware.de
Lists: pgsql-bugs

Hi,

Tom Lane [2007-11-07 13:49 -0500]:
> All the other diffs that Martin showed are divide-by-zero failures,
> and I do not see any of them on Gentoo's machine. I think that this
> must be a compiler bug. The first example in his diffs is just
> "select 1/0", which executes this code:
>
>     int32 arg1 = PG_GETARG_INT32(0);
>     int32 arg2 = PG_GETARG_INT32(1);
>     int32 result;
>
>     if (arg2 == 0)
>         ereport(ERROR,
>                 (errcode(ERRCODE_DIVISION_BY_ZERO),
>                  errmsg("division by zero")));
>
>     result = arg1 / arg2;
>
> It looks to me like Debian's compiler must be allowing the division
> instruction to be speculatively executed before the if-test branch
> is taken. Perhaps it is supposing that this is OK because control
> will return from ereport(), when in fact it will not (the routine
> throws a longjmp). Since we've not seen such behavior on any other
> platform, however, I suspect this is just a bug and not intentional.

I tried this on a Debian Alpha porter box (thanks, Steve, for pointing
me at it) with Debian's gcc 4.2.2. Latest sid indeed still has this
bug (the floor() one is confirmed fixed), not only on Alpha, but also
on sparc.

Since the simple test case did not reproduce the error, I tried to
make a more sophisticated one which resembles more closely what
PostgreSQL does (sigsetjmp/siglongjmp instead of exit(), some macros,
etc.). Unfortunately that was in vain: the test case still works
perfectly, both with no compiler options and with the ones used for
PostgreSQL. I attach it here nevertheless, just in case someone has
more luck than me.
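
For readers without the attachment, here is a minimal sketch of that kind of
test case. It is not the attached div.c; the names report_error, checked_div,
try_div and error_context are hypothetical stand-ins for PostgreSQL's error
machinery, and the only assumption is a POSIX system with sigsetjmp and
siglongjmp. The divisor is tested first, the error routine leaves via
siglongjmp instead of returning, and the division follows the test exactly as
in the quoted code.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <setjmp.h>
#include <stdio.h>

/* Hypothetical stand-in for the backend's error context; not PostgreSQL API. */
static sigjmp_buf error_context;

/*
 * Hypothetical stand-in for ereport(ERROR, ...): prints the message and
 * jumps back to the handler. It never returns normally, and it carries no
 * noreturn annotation.
 */
static void report_error(const char *msg)
{
    fprintf(stderr, "ERROR: %s\n", msg);
    siglongjmp(error_context, 1);
}

/* Mirrors the quoted code: test the divisor first, divide afterwards. */
static int checked_div(int arg1, int arg2)
{
    int result;

    if (arg2 == 0)
        report_error("division by zero");

    result = arg1 / arg2;
    return result;
}

/* Returns 0 and stores the quotient, or -1 if the error path was taken. */
int try_div(int arg1, int arg2, int *result)
{
    assert(result != NULL);

    if (sigsetjmp(error_context, 1) != 0)
        return -1;              /* arrived here via siglongjmp */

    *result = checked_div(arg1, arg2);
    return 0;
}
```

With a correctly compiled binary, try_div(1, 0, &r) should print the error
message and return -1. If the division were moved ahead of the test, one
would expect a trap at that point instead.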

So I tried to approach it from the other side: Building postgresql
with CFLAGS="-O0 -g" or "-O1 -g" works correctly, but with "-O2 -g" I
get the above bug.

So I guess I'll build with -O1 for the time being on sparc and alpha
to get correct binaries until this is sorted out. Any idea what else I
could try?

Thanks,

Martin

--
Martin Pitt http://www.piware.de
Ubuntu Developer http://www.ubuntu.com
Debian Developer http://www.debian.org

Attachment  Content-Type  Size
div.c       text/x-csrc   1.0 KB
