Followup on the UnixWare Optimizer bug.

From: "Larry Rosenman" <ler(at)lerctr(dot)org>
To: "'PostgreSQL-development'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Followup on the UnixWare Optimizer bug.
Date: 2005-08-25 00:16:57
Message-ID: E1E85Qg-000082-1T@lerami.lerctr.org
Lists: pgsql-hackers pgsql-patches

The following is from my SCO internal contact about the bug. It's
definitely their bug. Towards the end of the exact diagnosis is a
suggested work-around for now, as well as a note on a (possible)
memory leak.

-----
Dave and I were convinced that the CSE optimization was correct, and
manufactured data that we pushed through the interval_div() function with
and without the CSE optimization produced the same results.

It was only when we identified the exact input values from the interval
regression test that we were able to see the problem.

With span->month = 493
span->day = 11
factor = 10;

The CSE optimization problem occurs with the following code from
interval_div(), and I am assuming also in interval_mul():

    /* Cascade fractions to lower units */
    /* fractional months full days into days */
    result->day += month_remainder * DAYS_PER_MONTH;
    /* fractional months partial days into time */
    day_remainder += (month_remainder * DAYS_PER_MONTH) -
                     (int) (month_remainder * DAYS_PER_MONTH);

At the point of failure:

month_remainder = 49.3 - 49 = .3 (INEXACT) and represented in the
80 bit FP register as .29999999999999999

month_remainder * 30 = 8.99999999999999997 (also inexact)

That results in result->day = 8 + 1 = 9 (the 8 from the truncation plus
the 1 whole day from 11 / 10), but day_remainder is
.99999999999999997 + the .1 left from the earlier division.

The later call to interval_justify_hours() subtracts the 1 day's worth of
seconds from time (day_remainder * SECS_PER_DAY + time portion)
and bumps result->day by 1 ==> 10.

The FAILURE is because the compiler is trying to reduce the 3 FP multiplies
to 1 multiply, using the value 3 times. The problem occurs because the
8.99999999999999997 is inexact, and when the CSE value is stored as a
"double", it is rounded to 9.0.

On the 2nd and 3rd uses of the value, the 9.0 (64-bit FP data) is used, but
the 1st use is still using the 80-bit (8.9999999999999) value. So the integer
truncation still gets 8, but the calculated day_remainder is now (9.0 - 9)
+ .1 (previous day_remainder contents).
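
The mixed-precision effect described above can be reproduced portably with a scaled-down analogy. This is only a sketch, not the compiler's actual code path: here double stands in for the 80-bit register value and float stands in for the stored 64-bit double. The names Cascade, Mode and cascade() are hypothetical and do not appear in PostgreSQL.

```c
#include <assert.h>
#include <stddef.h>

#define DAYS_PER_MONTH 30

/* Hypothetical container for the two outputs of the cascade step. */
typedef struct {
    int    whole_days;     /* amount that would be added to result->day */
    double day_remainder;  /* fractional day that would go into time */
} Cascade;

/* Which precision each use of the shared product sees. */
typedef enum { ALL_WIDE, ALL_NARROW, MIXED } Mode;

/*
 * ANALOGY ONLY: "wide" (double) plays the 80-bit register value and
 * "narrow" (float) plays the copy that was rounded when stored.
 */
static Cascade cascade(int months, int factor, Mode mode)
{
    double wide;
    volatile float stored;  /* volatile forces a real store, hence rounding */
    double narrow;
    double for_trunc;
    double for_frac;
    Cascade c;

    assert(factor != 0);

    /* Fractional months left after integer division, scaled to days. */
    wide = ((double) months / factor - months / factor) * DAYS_PER_MONTH;
    stored = (float) wide;
    narrow = stored;

    /* MIXED mimics the bug: 1st use is wide, 2nd and 3rd uses are narrow. */
    for_trunc = (mode == ALL_NARROW) ? narrow : wide;
    for_frac  = (mode == ALL_WIDE) ? wide : narrow;

    c.whole_days = (int) for_trunc;
    c.day_remainder = for_frac - (int) for_frac;
    return c;
}
```

With months = 493 and factor = 10, both consistent modes account for about 9 days in total (8 plus a remainder just under 1, or 9 plus 0), while MIXED yields 8 whole days and a remainder of 0, so almost a full day disappears. That is the same shape as the failure in the regression test.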

Our bug is that either:

- the CSE value should have been preserved as an 80-bit long double
since that is how the internal calculations are being done.....

This is the "correct" fix and will take us some time to make
certain that we haven't broken anything.

- the CSE value should have been rounded (to 64-bit precision) before use
at all 3 points in the code.

This would have resulted in result->day = 9 + 1 = 10 and the
result->time would have been correctly less than 1 day. The
interval_justify_hours() call would make no adjustments and the result
would be identical - as expected.

As I said, it will take us some time to work up the fix and revalidate the
compiler. Since you have a release coming up, I want to suggest the following
work-around for a Common Subexpression Elimination (CSE) bug in "some"
compiler.......

For both interval_div() and interval_mul()

    double CSE;

    /* Cascade fractions to lower units */
    /* fractional months full days into days */
    CSE = month_remainder * DAYS_PER_MONTH;
    result->day += CSE;
    /* fractional months partial days into time */
    day_remainder += (CSE) - (int)(CSE);
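
For readers who want to try the suggested work-around in isolation, here is a minimal compilable sketch. It assumes a cut-down stand-in struct (IntervalSketch) and a hypothetical helper name (cascade_month_fraction); the real Interval type and the surrounding interval_div() logic are not reproduced.

```c
#include <assert.h>
#include <stddef.h>

#define DAYS_PER_MONTH 30

/* Hypothetical stand-in holding only the field this step touches. */
typedef struct {
    int day;
} IntervalSketch;

/*
 * Work-around shape from the message: compute the product once, keep it
 * in a named double, and use that same value at all three points.
 */
static void cascade_month_fraction(IntervalSketch *result,
                                   double month_remainder,
                                   double *day_remainder)
{
    double cse;

    assert(result != NULL && day_remainder != NULL);

    /* Cascade fractions to lower units */
    cse = month_remainder * DAYS_PER_MONTH;
    /* fractional months full days into days (implicit double -> int) */
    result->day += cse;
    /* fractional months partial days into time */
    *day_remainder += cse - (int) cse;
}
```

Calling it with result->day = 1, month_remainder = 0.25 and *day_remainder = 0.1 leaves day at 8 and the remainder at roughly 0.6, since 0.25 * 30 = 7.5 is exactly representable. Whether a given compiler really rounds the named temporary to 64 bits is the vendor's claim for their compiler, not something this sketch can guarantee elsewhere.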

Also note that there appears to be a memory leak in the interval_****
routines. For example, interval_div() allocates a "result" Interval.
It eventually passes this result through to interval_justify_hours(), which
allocates another Interval "result", and that "result" is what gets passed
back to the caller of interval_div(). The 1st Interval allocated appears to be
left around.......
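
The allocation pattern described above can be sketched as follows. Everything here is illustrative: IntervalSketch, alloc_interval(), justify_copy() and the two div_* functions are hypothetical names, and malloc/free stand in for the server's own allocator. The counter exists only to make the extra allocation visible.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the Interval type. */
typedef struct {
    int    day;
    double time;
} IntervalSketch;

static int live_allocations = 0;   /* allocations not yet released */

static IntervalSketch *alloc_interval(void)
{
    IntervalSketch *p = malloc(sizeof *p);

    if (p == NULL)
        abort();
    live_allocations++;
    return p;
}

static void free_interval(IntervalSketch *p)
{
    assert(p != NULL);
    free(p);
    live_allocations--;
}

/* Like the described justify step: returns a freshly allocated copy. */
static IntervalSketch *justify_copy(const IntervalSketch *span)
{
    IntervalSketch *result = alloc_interval();

    *result = *span;               /* normalization itself is elided */
    return result;
}

/* Pattern from the message: the first allocation is never released. */
static IntervalSketch *div_as_described(int day)
{
    IntervalSketch *result = alloc_interval();

    result->day = day;
    result->time = 0.0;
    return justify_copy(result);   /* original "result" is left around */
}

/* One possible alternative: release the intermediate before returning. */
static IntervalSketch *div_releasing(int day)
{
    IntervalSketch *result = alloc_interval();
    IntervalSketch *justified;

    result->day = day;
    result->time = 0.0;
    justified = justify_copy(result);
    free_interval(result);
    return justified;
}
```

After one call to div_as_described() the counter reads 2 although only one pointer was handed back; div_releasing() adds just 1. How much this matters in the real routines depends on how and when their allocator reclaims memory, which the message does not cover.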
------

I will get a pre-release copy of the compiler to test, but it will take a
while, since they have to revalidate it.

Comments?

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 972-414-9812 E-Mail: ler(at)lerctr(dot)org
US Mail: 3535 Gaspar Drive, Dallas, TX 75220-3611 US
