Re: Re: [GENERAL] +/- Inf for float8's

From: "Ross J(dot) Reedstrom" <reedstrm(at)rice(dot)edu>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu>, pgsql-hackers(at)hub(dot)org
Subject: Re: Re: [GENERAL] +/- Inf for float8's
Date: 2000-08-20 22:08:28
Message-ID: 20000820170828.A31805@rice.edu
Lists: pgsql-hackers

On Sun, Aug 20, 2000 at 12:33:00AM +0200, Peter Eisentraut wrote:
<snip side comment about bug tracking. My input: for an email controllable
system, take a look at the debian bug tracking system>

> Show me a system where it doesn't work and we'll get it to work.
> UNSAFE_FLOATS as it stands is probably not the most appropriate behaviour;
> it intends to speed things up, not make things portable.
>

I agree. In the previous thread on this, Thomas suggested creating a flag
that would allow turning the CheckFloat8Val function calls into a no-op
macro. Sounds like a plan to me.
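
For concreteness, here is a minimal sketch of what such a flag could look
like. The flag name SKIP_FLOAT8_CHECKS, the macro CHECK_FLOAT8_VAL, and the
helpers float8_in_range and float8_mul are all hypothetical stand-ins, not
the backend code; the sketch returns a status instead of raising an error
so that it can stand alone.

```c
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical range test: rejects values whose magnitude exceeds DBL_MAX
 * (i.e. +/-Infinity). NaN passes, because every comparison with NaN is
 * false. */
static bool float8_in_range(double val)
{
    return !(fabs(val) > DBL_MAX);
}

/* Hypothetical compile-time switch: with SKIP_FLOAT8_CHECKS defined, the
 * check collapses to a constant and costs nothing at run time. */
#ifdef SKIP_FLOAT8_CHECKS
#define CHECK_FLOAT8_VAL(val) (true)
#else
#define CHECK_FLOAT8_VAL(val) float8_in_range(val)
#endif

/* Example caller: multiply, then run (or skip) the range check. */
static bool float8_mul(double a, double b, double *result)
{
    *result = a * b;
    return CHECK_FLOAT8_VAL(*result);
}
```

Built without the flag, float8_mul(DBL_MAX, 2.0, &r) reports failure; built
with -DSKIP_FLOAT8_CHECKS, the same call succeeds and leaves Infinity in r.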

>
> > > NULL and NaN are not quite the same thing imho. If we are allowing NaN
> > > in columns, then it is *known* to be NaN.
> >
> > For the purposes of ordering, however, they are very similar.
>
> Then we can also treat them similar, i.e. sort them all last or all first.
> If you have NaN's in your data you wouldn't be interested in ordering
> anyway.

Right, but the problem is that NULLs are an SQL language feature, and
therefore rightly special cased directly in the sorting apparatus. NaN is
type specific, and I'd be loath to special case it in the same place. As
it happens, I've spent some time this weekend groveling through the sort
(and index) code, and have an idea for a type specific fix.

Here's the deal, and an actual, honest to goodness bug in the current code.

As it stands, we allow one non-finite value to be stored in a float8 field:
NaN, with partial parsing of 'Infinity'.

As I reported last week, NaNs break sorts: they act as barriers, creating
sorted subsections in the output. As those familiar with the code have
already guessed, there is a more serious bug: NaNs break indices on
float8 fields, essentially chopping the index off at the first NaN.
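
To illustrate why NaN acts as a barrier, here is a small sketch. The
function naive_float8_cmp is a hypothetical stand-in for a comparison
built only on the C operators, not a copy of the backend code. Every
ordered comparison involving NaN is false, so NaN compares "equal" to
everything, and that equality is not transitive.

```c
#include <math.h>

/* Hypothetical naive three-way comparison using only < and >. */
static int naive_float8_cmp(double a, double b)
{
    if (a < b)
        return -1;
    if (a > b)
        return 1;
    return 0;   /* also reached whenever a or b is NaN */
}
```

Here naive_float8_cmp(NAN, 1.0) and naive_float8_cmp(NAN, 2.0) both return
0, yet naive_float8_cmp(1.0, 2.0) returns -1. A sort or btree that trusts
this comparison has no consistent place to put the NaN.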

Fixing this turns out to be a one liner to btfloat8cmp.

Fixing sorts is a bit trickier, but can be done. Currently, I've hacked
the float8lt and float8gt code to sort NaN after +/-Infinity (since
NULLs are special cased, they end up sorting after NaN). I don't see
any problems with this solution, and it gives the desired behavior.
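
The ordering described above can be sketched roughly as follows. This is
not the attached patch; float8_cmp_nan_last, float8_qsort_cmp and
sort_float8 are hypothetical names, and the sketch only shows the idea of
handling NaN explicitly before falling back to the ordinary operators.

```c
#include <math.h>
#include <stdlib.h>

/* Hypothetical three-way comparison: NaN sorts after every other value,
 * including +Infinity, and all NaNs compare equal to each other. */
static int float8_cmp_nan_last(double a, double b)
{
    int a_nan = isnan(a);
    int b_nan = isnan(b);

    if (a_nan || b_nan)
    {
        if (a_nan && b_nan)
            return 0;
        return a_nan ? 1 : -1;
    }
    if (a < b)
        return -1;
    if (a > b)
        return 1;
    return 0;
}

/* qsort() adapter for arrays of double. */
static int float8_qsort_cmp(const void *pa, const void *pb)
{
    return float8_cmp_nan_last(*(const double *) pa, *(const double *) pb);
}

/* Sort an array in place using the NaN-last ordering. */
static void sort_float8(double *values, size_t n)
{
    qsort(values, n, sizeof(double), float8_qsort_cmp);
}
```

Because the comparison is now a total order, the same function could back
both the less-than/greater-than operators and a btree support routine, so
sorts and indices would agree on where NaN goes.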

I've attached a patch which fixes all the sort and index problems, as well
as adding input support for -Infinity. This is not a complete solution,
since I haven't done anything with the CheckFloat8Val test. On my
system (linux/glibc2.1) compiling with UNSAFE_FLOATS seems to work fine
for testing.
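
As a rough illustration of the input side, here is a sketch of accepting
the special spellings before falling back to strtod(). The function
parse_float8 is a hypothetical name, the matching is case-sensitive for
brevity, and none of this is taken from the attached patch.

```c
#include <math.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical input routine: returns 1 and stores the value on success,
 * 0 if the text is not a valid number. */
static int parse_float8(const char *text, double *out)
{
    char *end;

    if (strcmp(text, "NaN") == 0)
    {
        *out = NAN;
        return 1;
    }
    if (strcmp(text, "Infinity") == 0)
    {
        *out = INFINITY;
        return 1;
    }
    if (strcmp(text, "-Infinity") == 0)
    {
        *out = -INFINITY;
        return 1;
    }

    /* Ordinary numbers: require that strtod() consume the whole string. */
    *out = strtod(text, &end);
    return end != text && *end == '\0';
}
```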

>
> Side note 2: The paper "How Java's floating point hurts everyone
> everywhere" provides for good context reading.

http://http.cs.berkeley.edu/~wkahan/JAVAhurt.pdf ? I'll take a look at it
when I get in to work Monday.

>
> Side note 3: Once you read that paper you will agree that using floating
> point with Postgres is completely insane as long as the FE/BE protocol is
> text-based.

Probably. But it's not our job to enforce sanity, right? Another way to think
about it is fixing the implementation so the deficiencies of the FE/BE stand
out in a clearer light. ;-)

Ross
--
Ross J. Reedstrom, Ph.D., <reedstrm(at)rice(dot)edu>
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St., Houston, TX 77005

Attachment Content-Type Size
float-fix.diff text/plain 1.9 KB
