Adding IEEE 754:2008 decimal floating point and hardware support for it

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Adding IEEE 754:2008 decimal floating point and hardware support for it
Date: 2013-06-11 23:56:34
Message-ID: 51B7B932.3000407@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi all

Currently DECIMAL is an alias for NUMERIC, Pg's built-in arbitrary
precision and scale decimal type. I'd like to explore the possibility of
using hardware decimal floating point support in newer processors,
compilers and C libraries to enhance DECIMAL / NUMERIC performance.

With the advent of _Decimal32, _Decimal64 and _Decimal128 support in
IEEE 754:2008 as supported in gcc in <float.h> TR24732 we have the
opportunity to make use of hardware representations of decimal floating
point values and hardware implementations of operations on them, gaining
a potentially huge performance boost in exchange for more limited
precision and scale. I'd like to gather ideas and suggestions about how
we might approach this.

The main thing I'm wondering is how/if to handle backward compatibility
with the existing NUMERIC and its DECIMAL alias, or whether adding new
DECIMAL32, DECIMAL64, and DECIMAL128 types would be more appropriate.
I'd love to just use the SQL standard types name DECIMAL if possible,
and the standard would allow for it (see below), but backward compat
would be a challenge, as would coming up with a sensible transparent
promotion scheme from 32->64->128->numeric and ways to stop undesired
promotion.

What I'm currently thinking of is using the same strategy we use right
now for FLOAT(n) where we select between float4 and float8 based on the
specified precision. We could do the same for DECIMAL; up to
DECIMAL(94,7) would become decimal32; up to DECIMAL(382,16) would become
decimal64 and DECIMAL128(34,6142); everything higher would become
NUMERIC as currently. NUMERIC would be unaffected. (Ideally we wouldn't
have to do the type change in the parser hack but that's not really
possible so long as Pg doesn't preserve typmods in calculations and
intermediate outputs).

According to TR24732 (
http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1312.pdf) the _Decimal
family of types offer:

_Decimal32: 7 coefficient digits, 10^97 to 10^-94 range.
_Decimal64: 16 coefficient digits, 10^385 to 10^-382 range.
_Decimal128 34 coefficient digits, 10^6145 to 10^-6142 range.

There was a thread about this on -general some time ago:


http://www.postgresql.org/message-id/4CB26B16.7080602@postnewspapers.com.au

that never went anywhere. Other discussion is mostly of use cases and is
more hypothetical, but outlines why they'd be useful:


http://www.postgresql.org/message-id/CADLWmXVmne9t5x-hR-XGOxEyOWQX5BfZwc9Qb=xhsJ_gkG_AaQ@mail.gmail.com

In terms of how they fit in to the standard, the copy of the SQL:2008
draft I have here says:

* NUMERIC specifies the data type exact numeric, with the decimal
precision and scale specified by the
<precision> and <scale>.
* DECIMAL specifies the data type exact numeric, with the decimal scale
specified by the <scale> and
the implementation-defined decimal precision equal to or greater than
the value of the specified <precision>.

Additionally:

* For the <exact numeric type>s DECIMAL and NUMERIC, the maximum values
of <precision> and
of <scale> are implementation-defined.

... so it seems we'd be free to use the hardware types and could
possibly internally promote from smaller to larger decimal types as
appropriate.

My main concern is that even the largest fixed-size decimal can't store
the arbitrary precision and scale supported by Pg's NUMERIC, and people
are used to using DECIMAL as an alias for NUMERIC. We're in a bit of a
BC trap, as the spec would allow us to just use the hardware types, but
we've already provided support for a nearly unlimited (but much, much
slower) type using the same name.

Regarding the breadth of support, it looks like we could use gcc's
built-in types if available, and otherwise fall back to one of the
portable decimal floating point maths libraries.

gcc 4.3 added _Decimal types, see
http://gcc.gnu.org/gcc-4.3/changes.html, so it's ancient history for
gcc. Hardware support isn't required; in fact the real question would be
whether gcc actually uses the hardware operations where they're
supported. I'll need to dig into that.

For Windows, MSVC doesn't support the types as built-ins. There's the
Intel Decimal Floating Point Library
(http://software.intel.com/en-us/articles/intel-decimal-floating-point-math-library,
also available from http://www.netlib.org/misc/intel/) to provide
support as a library. IBM's "decnumber" libraray is another possible
alternative.

LLVM's clang does not support those types according to the manual:
"clang does not support decimal floating point types (_Decimal32 and
friends) or fixed-point types (_Fract and friends); nobody has expressed
interest in these features yet, so it's hard to say when they will be
implemented." (http://clang.llvm.org/docs/UsersManual.html). The Intel
library or IBM decnumber should work, but would need to be checked.

See
http://www.ac.usc.es/arith19/sites/default/files/3670a225-spec-session-DFP-paper2.pdf
for (somewhat old) details on Intel processor support. There's also
support in POWER6 and System z10 according to comments on the Intel
library article.

So ... thoughts/comments? Think this is a reasonable avenue to pursue?

I'd be inclined to start by adding basic DECIMAL32, DECIMAL64 and
DECIMAL128 support with an implicit promotion to NUMERIC available.
After that the idea would be to progressively add operators and
functions that worked directly on these types, letting the numeric
versions handle what wasn't implemented for decimal yet. Finally I'd
want to change the parser's interpretation of qualified DECIMAL to
translate to DECIMAL32/64/128 as appropriate.

This isn't work I have any funded time for, so to the degree I could do
it at all it'd be something I'd be taking on as a project out of
personal interest and for learning. That means "not fast".

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2013-06-12 00:18:24 Re: JSON and unicode surrogate pairs
Previous Message Tatsuo Ishii 2013-06-11 23:44:55 Re: Parallell Optimizer