Re: Bug in numeric multiplication

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bug in numeric multiplication
Date: 2015-11-18 10:33:19
Message-ID: CAEZATCU6Ap9QYxtGzZC6KU89W0vuOBGMXhwNFew5SWSAdLp1kw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 17 November 2015 at 23:57, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> After thinking some more about what this is doing, it seems to me that
> we could avoid changing div[qi + 1] at all here, and instead deal with
> leftover dividend digits by shoving them into the floating-point part of
> the calculation. All that we really need to have happen as a consequence
> of the above is to inject an additional "div[qi] * NBASE" term into future
> calculations of fdividend. So we can do that out-of-band and avoid
> mucking up the maxdiv invariant.
>

That makes sense.

> Also, I realized that this bit:
>
> /*
> * All the div[] digits except possibly div[qi] are now in the
> * range 0..NBASE-1.
> */
> maxdiv = Abs(newdig) / (NBASE - 1);
> maxdiv = Max(maxdiv, 1);
>
> is unduly conservative, because, while indeed div[qi] might be out of
> range, there is no need to consider it in maxdiv anymore, because the new
> maxdiv value only determines what will happen in future loop iterations
> where the current div[qi] will be irrelevant. So we should just reset
> maxdiv to 1 unconditionally here.
>

OK, makes sense too.

> So that led me to the attached patch, which passes regression tests fine.

I'd feel better about that if the regression tests actually did
something that stressed this, but coming up with such a test is
non-trivial.

> I'm not sure that it's provably okay though. The loose ends are to show
> that div[qi] can't overflow an int during the divisor-subtraction step
> and that "outercarry" remains bounded. Testing suggests that outercarry
> can't exceed INT_MAX/NBASE, but I don't see how to prove that.
>

At the top of the loop we know that

Abs(div[qi]) <= maxdiv * (NBASE-1)
<= INT_MAX - INT_MAX/NBASE - 1

Then we add Abs(qdigit) to maxdiv which potentially triggers a
normalisation step. That step adds carry to div[qi], and we know that

Abs(carry) <= INT_MAX/NBASE + 1

so the result is that Abs(div[qi]) <= INT_MAX -- i.e., it can't
overflow at that point, but it could be on the verge of doing so.

Then the divisor-subtraction step does

div[qi] -= qdigit * var2digits[0]

That looks as though it could overflow, although actually I would
expect qdigit to have the same sign as div[qi], so that this would
drive div[qi] towards zero. If you didn't want to rely on that though,
you could take advantage of the fact that this new value of div[qi] is
only needed for outercarry, so you could modify the
divisor-subtraction step to skip div[qi]:

for (i = 1; i < istop; i++)
div[qi + i] -= qdigit * var2digits[i];

and fold the most significant digit of the divisor-subtraction step
into the computation of outercarry so that there is definitely no
possibility of integer overflow:

outercarry = outercarry * NBASE + (double) div[qi]
- (double) qdigit * var2digits[0];

It's still difficult to prove that outercarry remains bounded, but to
a first approximation

qdigit ~= ( outercarry * NBASE + (double) div[qi] ) / var2digits[0]

so outercarry ought to be driven towards zero, as the comment says. It
certainly doesn't look like it will grow without bounds, but I haven't
attempted to work out what its bounds actually are.

Regards,
Dean

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2015-11-18 10:50:17 Re: Proposal: "Causal reads" mode for load balancing reads without stale data
Previous Message Konstantin Knizhnik 2015-11-18 10:18:39 SPI and transactions