Re: Zero-padding and zero-masking fixes for to_char(float)

From: Jeff Anton <antonpgsql(at)hesiod(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Zero-padding and zero-masking fixes for to_char(float)
Date: 2015-03-24 16:26:10
Message-ID: 55119022.3020509@hesiod.org
Lists: pgsql-hackers

The issue of how many significant (decimal) digits survive conversion to
and from floating point representation is a complex one.

What 'significant' means may depend upon the intent.

There are (at least) two different questions which may need to be asked.

* How many digits can be stored and then accurately returned?
or
* How many decimal digits are needed to recreate a floating point value?
Or, in longer form: if you have a floating point value, you may want to
print it in decimal form and later scan that text to recreate the exact
bit pattern of the original floating point value. How many decimal digits
do you need for that?

The first question yields a smaller number of digits than the second one!
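
To make this concrete, here is a rough sketch in psql. The exact output
text depends on the server version and platform, so the comments describe
only what I would expect to see:

-- Question 1: a 15-digit decimal constant can be stored in a float8 and
-- printed back unchanged (float8 promises at least 15 significant
-- decimal digits).
SELECT float8 '0.123456789012345';

-- Question 2: recreating the exact stored bit pattern can take 17 digits.
-- extra_float_digits asks the server to print the extra digits; the exact
-- text it produces varies by version.
SET extra_float_digits = 3;
SELECT float8 '0.1';
RESET extra_float_digits;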

Zero padding is, IMO, a bad idea altogether. It makes people feel
better, but it adds inaccuracy: the trailing zeros suggest precision that
the stored value never had. I've lost this argument so many times now
that I only mention it for the real number geeks out there.

PostgreSQL seems to be using the first interpretation and reporting the
smaller number of digits. I've noticed this with pg_dump: a dump and
restore of floating point values does not reproduce the same floating
point values. To me, that is inexcusable. Using the -Fc format, real
values are preserved. I have a large database of security prices, and I
want accuracy above all.
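
One way to see the problem without running a full dump and restore is to
compare the bit pattern of a value with the bit pattern you get back after
a round trip through its default text form. This is only a sketch (the
sample constant is arbitrary), but float8send() exposes the raw bits as a
bytea, so any difference shows up directly:

SELECT float8send(x)                               AS original_bits,
       float8send(x::text::float8)                 AS after_text_round_trip,
       float8send(x) = float8send(x::text::float8) AS survived
FROM (VALUES (float8 '1.123456789123456789')) AS t(x);

Under the default output precision many values come back with different
bits; with SET extra_float_digits = 3 the text form carries enough digits
for the bits to match again.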

I do not have time right now to produce the needed evidence for all of
these floating point cases. If there is interest, I can produce it in a
day or so.

Jeff Anton

BTW: This is my first posting to this list. I should introduce myself.
I'm Jeff Anton. I was the first Postgres project lead programmer
working for Michael Stonebraker at U.C. Berkeley a very long time ago.
The first version was never released. I've since worked for several db
companies.

On 03/24/15 06:47, Noah Misch wrote:
> On Sun, Mar 22, 2015 at 10:53:12PM -0400, Bruce Momjian wrote:
>> On Sun, Mar 22, 2015 at 04:41:19PM -0400, Noah Misch wrote:
>>> On Wed, Mar 18, 2015 at 05:52:44PM -0400, Bruce Momjian wrote:
>>>> This "junk" digit zeroing matches the Oracle behavior:
>>>>
>>>> SELECT to_char(1.123456789123456789123456789d, '9.9999999999999999999999999999999999999') as x from dual;
>>>> ------
>>>> 1.1234567891234568000000000000000000000
>>>>
>>>> Our output with the patch would be:
>>>>
>>>> SELECT to_char(float8 '1.123456789123456789123456789', '9.9999999999999999999999999999999999999');
>>>> ------
>>>> 1.1234567891234500000000000000000000000
>
>>> These outputs show Oracle treating 17 digits as significant while PostgreSQL
>>> treats 15 digits as significant. Should we match Oracle in this respect while
>>> we're breaking compatibility anyway? I tend to think yes.
>>
>> Uh, I am hesitant to adjust our precision to match Oracle as I don't
>> know what they are using internally.
>
> http://sqlfiddle.com/#!4/8b4cf/5 strongly implies 17 significant digits for
> float8 and 9 digits for float4.
