Re: WIP: rewrite numeric division

From: Michael Paesold <mpaesold(at)gmx(dot)at>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-patches(at)postgresql(dot)org, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: WIP: rewrite numeric division
Date: 2007-07-17 08:23:21
Message-ID: 469C7C79.1090005@gmx.at
Lists: pgsql-patches

Please, let's revisit this rather than postpone it without further
discussion. I did not know about the correctness issues in div_var(), but
now that I do, I feel I am just waiting for the problem to hit me or
someone else.
So can you, Tom, please describe in which situations the old code is
actually unsafe?

We usually *round* all values to at most 4 decimal places -- are we
on the safe side? Does this affect only high-precision division, or all
divisions?

Best Regards
Michael Paesold

Bruce Momjian wrote:
> Because this has not been applied, this has been saved for the 8.4 release:
>
> http://momjian.postgresql.org/cgi-bin/pgpatches_hold
>
> ---------------------------------------------------------------------------
>
> Tom Lane wrote:
>> I wrote:
>>> I just blew the dust off my old copy of Knuth vol 2, and see that his
>>> algorithm for multi-precision division generates output digits that are
>>> correct to start with (or at least he never needs to revisit a digit
>>> after moving on to the next). ISTM we should go over to an approach
>>> like that.
>> The attached proposed patch rewrites div_var() using Knuth's algorithm,
>> meaning that division should always produce an exact truncated output
>> when asked to truncate at X number of places. This passes regression
>> tests and fixes both of the cases previously exhibited:
>> http://archives.postgresql.org/pgsql-bugs/2007-06/msg00068.php
>> http://archives.postgresql.org/pgsql-general/2005-05/msg01109.php
>>
>> The bad news is that it's significantly slower than the CVS-HEAD code;
>> it appears that for long inputs, div_var is something like 4X slower
>> than before, depending on platform. The numeric_big regression test
>> takes about twice as long as before on one of my machines, and 50%
>> longer on another. This is because the innermost loop now involves
>> integer division, which it didn't before. (According to oprofile,
>> just about all the time goes into the loop that subtracts qhat * divisor
>> from the working dividend, which is what you'd expect.)
>>
>> Now it's unlikely that real-world applications are going to be as
>> dependent on the speed of div_var for long inputs as numeric_big is.
>> And getting the right answer has to take priority over speed anyway.
>> Still this is a bit annoying. Anyone see a way to speed it up, or
>> have another approach?
>>
>> regards, tom lane
>>
>
> Content-Description: numeric-div.patch.gz
>
> [ Type application/octet-stream treated as attachment, skipping... ]
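
For readers without Knuth vol 2 at hand, the following is a minimal sketch of
the schoolbook long-division scheme Tom refers to (Algorithm D, section 4.3.1).
It is not the attached patch and not the code in numeric.c; the function names,
the most-significant-first digit layout, and the choice of NBASE = 10000 are
assumptions made for illustration. It shows the two points made above: each
quotient digit is final once it is emitted (at most one correction happens
inside the same step), and the inner loop that subtracts qhat * divisor from
the working dividend is where the per-digit integer division shows up.

```python
NBASE = 10000  # assumption: digit base chosen to resemble PostgreSQL's numeric


def to_digits(value, base=NBASE):
    """Non-negative int -> list of digits in `base`, most significant first."""
    digits = []
    while True:
        value, d = divmod(value, base)
        digits.append(d)
        if value == 0:
            break
    return digits[::-1]


def from_digits(digits, base=NBASE):
    """Inverse of to_digits (leading zeros are allowed)."""
    value = 0
    for d in digits:
        value = value * base + d
    return value


def mul_small(digits, k, base):
    """digits * k for a single-digit k; result has one extra leading digit."""
    out = [0] * (len(digits) + 1)
    carry = 0
    for i in range(len(digits) - 1, -1, -1):
        carry, out[i + 1] = divmod(digits[i] * k + carry, base)
    out[0] = carry
    return out


def knuth_divide(u, v, base=NBASE):
    """Return (quotient_digits, remainder_digits) of u / v, truncated."""
    v = list(v)
    while len(v) > 1 and v[0] == 0:      # drop leading zeros of the divisor
        v.pop(0)
    if v == [0]:
        raise ZeroDivisionError("division by zero")
    n = len(v)
    m = len(u) - n
    if m < 0:                            # dividend shorter than divisor
        return [0], list(u)

    # D1: normalize so the leading divisor digit is at least base // 2.
    d = base // (v[0] + 1)
    un = mul_small(u, d, base)           # m + n + 1 digits
    vn = mul_small(v, d, base)[1:]       # still n digits, leading carry is 0

    q = [0] * (m + 1)
    for j in range(m + 1):
        # D3: estimate the quotient digit from the top digits.
        qhat, rhat = divmod(un[j] * base + un[j + 1], vn[0])
        while qhat >= base or (n > 1 and qhat * vn[1] > rhat * base + un[j + 2]):
            qhat -= 1
            rhat += vn[0]
            if rhat >= base:
                break
        # D4: subtract qhat * divisor from the working dividend (the hot loop).
        carry = 0                        # carry is zero or negative
        for i in range(n - 1, -1, -1):
            carry, un[j + 1 + i] = divmod(un[j + 1 + i] - qhat * vn[i] + carry, base)
        top = un[j] + carry
        if top < 0:
            # D6: the estimate was one too large; add the divisor back once.
            qhat -= 1
            c = 0
            for i in range(n - 1, -1, -1):
                c, un[j + 1 + i] = divmod(un[j + 1 + i] + vn[i] + c, base)
            top += c
        un[j] = top
        q[j] = qhat                      # never revisited after this point

    # D8: undo the normalization on the remainder with a short division by d.
    rem, carry = [], 0
    for digit in un[m + 1:]:
        carry = carry * base + digit
        rem.append(carry // d)
        carry %= d
    return q, rem
```

A call such as knuth_divide(to_digits(10**8), to_digits(3)) returns the
quotient and remainder as digit lists; from_digits() converts them back to
integers for checking against ordinary integer division. Python integers do not
overflow, so this sketch sidesteps the intermediate-width questions that a C
implementation has to handle.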
