Re: [PATCH] Fix old thinko in formula to compute sweight in numeric_sqrt().

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Joel Jacobson <joel(at)compiler(dot)org>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Fix old thinko in formula to compute sweight in numeric_sqrt().
Date: 2023-01-29 13:33:22
Message-ID: CAEZATCXzNvRGXNYkPD4mR23UK1DW+X5dUMMWutL_h=-b5M7Z2w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 28 Jan 2023 at 22:14, Joel Jacobson <joel(at)compiler(dot)org> wrote:
>
> I found what appears to be a small harmless error in numeric.c,
> that seems worthwhile to fix only because it's currently causes confusion.
>

Shrug. Looking at git blame, it's been like that for about 20 years,
and I wasn't aware of it causing confusion.

> HEAD, patched:
> sweight = (arg.weight * DEC_DIGITS) / 2 + 1
>

You haven't actually said why this formula is more correct than the
current one. I believe that it is when arg.weight >= 0, but I'm not
convinced it's correct for arg.weight < 0.

Given that this is only an approximate computation, which ensures
roughly 16 significant digits in the result, but can't guarantee
exactly 16 digits, I'm not convinced of the benefits of changing it.

> Note, however, that it's still possible to find examples of when sqrt(numeric)
> produce results with different precision for different DEC_DIGITS/NBASE values,
> but in such cases, it's intentional, and due to getting additional precision
> for free, since the larger the NBASE, the more decimal digits are produced
> at the same time per iteration in the calculation.
>
> Example:
>
> HEAD, unpatched
> DEC_DIGITS sqrt(102::numeric)
> 4 10.09950493836208
> 2 10.099504938362078
> 1 10.0995049383620780
>
> HEAD, patched:
> DEC_DIGITS sqrt(102::numeric)
> 4 10.099504938362078
> 2 10.09950493836208
> 1 10.09950493836208
>
> According to the comment in numeric_sqrt(), the goal is to give at least
> NUMERIC_MIN_SIG_DIGITS (16) significant digits.
>
> Since 10.09950493836208 has 16 significant digits, we can see above how
> DEC_DIGITS==2 causes an additional unnecessary significant digit to be computed,
> and for DEC_DIGITS==1, two additional unnecessary significant digits are
> computed.
>
> The patched version returns 16 significant digits as expected for DEC_DIGITS==2
> and DEC_DIGITS==1, and for DEC_DIGITS==4 we get an additional digit for free.
>

You lost me here. In unpatched HEAD, sqrt(102::numeric) produces
10.099504938362078, not 10.09950493836208 (with DEC_DIGITS = 4). And
wasn't your previous point that when DEC_DIGITS = 4, the new formula
is the same as the old one?

> In conclusion, the proposed patch fixes a harmless problem, but is important
> to fix, since otherwise, anyone who want to experiment with different
> DEC_DIGITS/NBASE combinations by changing the `#if 0` preprocessor values
> in the top of numeric.c will get surprising results from sqrt().
>

Anyone changing DEC_DIGITS/NBASE will find hundreds of regression test
failures due to changes in result precision all over the place (and
failures due to the overflow limit changing). I don't see why
numeric_sqrt() should be singled out for fixing.

> In passing, also add pow10[] values for DEC_DIGITS==2 and DEC_DIGITS==1,
> since otherwise it's not possible to compile such DEC_DIGITS values
> due to the assert:
>
> StaticAssertDecl(lengthof(pow10) == DEC_DIGITS, "mismatch with DEC_DIGITS");
>

That might be worth doing, to ensure that the code still compiles for
other DEC_DIGITS/NBASE values. I'm not sure how useful that really is
anymore though. As the comment at the top says, it's kept mostly for
historical reasons.

Regards,
Dean

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2023-01-29 14:49:56 Re: [PoC] Improve dead tuple storage for lazy vacuum
Previous Message Dmitry Dolgov 2023-01-29 13:32:19 Re: pg_stat_statements and "IN" conditions