Re: Bug in numeric multiplication

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bug in numeric multiplication
Date: 2015-11-18 15:20:56
Message-ID: 13642.1447860056@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> writes:
> On 17 November 2015 at 23:57, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I'm not sure that it's provably okay though. The loose ends are to show
>> that div[qi] can't overflow an int during the divisor-subtraction step
>> and that "outercarry" remains bounded. Testing suggests that outercarry
>> can't exceed INT_MAX/NBASE, but I don't see how to prove that.

> At the top of the loop we know that
> Abs(div[qi]) <= maxdiv * (NBASE-1)
> <= INT_MAX - INT_MAX/NBASE - 1
> Then we add Abs(qdigit) to maxdiv which potentially triggers a
> normalisation step. That step adds carry to div[qi], and we know that
> Abs(carry) <= INT_MAX/NBASE + 1
> so the result is that Abs(div[qi]) <= INT_MAX -- i.e., it can't
> overflow at that point, but it could be on the verge of doing so.

Right.

> Then the divisor-subtraction step does
> div[qi] -= qdigit * var2digits[0]
> That looks as though it could overflow, although actually I would
> expect qdigit to have the same sign as div[qi], so that this would
> drive div[qi] towards zero.

But if outercarry is nonzero, then qdigit is going to be primarily driven
by outercarry not div[qi], so I'm afraid it's mistaken to rely on it
having the right sign to cause cancellation.

> If you didn't want to rely on that though,
> you could take advantage of the fact that this new value of div[qi] is
> only needed for outercarry, so you could modify the
> divisor-subtraction step to skip div[qi]:
> and fold the most significant digit of the divisor-subtraction step
> into the computation of outercarry

Cute idea. Another thought I'd had is that we could change the carry
propagation loop limits so that div[qi] is normalized along with the other
digits. Then we'd have a carry leftover at the end of the loop, but we
could add it to outercarry. That makes the argument that div[qi] does
not overflow the same as for the other digits.

However, I kind of like your idea of postponing the div[qi] subtraction
to the outercarry update, because then it's very direct to see that
that update should result in near-total cancellation.

> so outercarry ought to be driven towards zero, as the comment says. It
> certainly doesn't look like it will grow without bounds, but I haven't
> attempted to work out what its bounds actually are.

I'm kind of stuck on that too. I did some experimentation by tracking
maximum values of outercarry in the regression tests (including
numeric_big) and did not see it get larger than a couple hundred thousand,
ie more or less INT_MAX/NBASE. But I don't have an argument as to why
that would be the limit.

Another issue here is that with outercarry added into the qdigit
computation, it's no longer clear what the bounds of qdigit itself are,
so is it really safe to assume that "div[qi + i] -= qdigit * var2digits[i]"
can't overflow? Stated in other terms, why are we sure that the new
maxdiv value computed at the bottom of the carry propagation stanza is
within the safe range?

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2015-11-18 15:21:45 Re: Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
Previous Message Tom Lane 2015-11-18 14:35:09 Re: Feature or bug: getting "Inf"::timestamp[tz] by "regular" value