Re: Have I found an interval arithmetic bug?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Bryn Llewellyn <bryn(at)yugabyte(dot)com>, pgsql-hackers list <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Have I found an interval arithmetic bug?
Date: 2021-07-28 16:32:03
Message-ID: CA+TgmoZbuxd=+oLHAB5iZVA3yGkyfkJXhqS4E1vGe9QcK3BE3A@mail.gmail.com
Lists: pgsql-general pgsql-hackers

On Wed, Jul 28, 2021 at 11:52 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> You know, I was thinking exactly that thing earlier. Changing the
> on-disk representation is certainly a nonstarter, but the problem
> here stems from expecting interval_in to do something sane with inputs
> that do not correspond to any representable value. I do not think we
> have any other datatypes where we expect the input function to make
> choices like that.

It's not exactly the same issue, but the input function whose behavior
most regularly trips people up is bytea, because they try something
like 'x'::bytea and it seems to DWTW so then they try it on all their
data and discover that, for example, '\'::bytea fails outright, or
that ''::bytea = '\x'::bytea, contrary to expectations. People often
seem to think that casting to bytea should work like convert_to(), but
it doesn't. As in the case at hand, byteain() has to guess whether the
input is intended to be the 'hex' or 'escape' format, and because the
'escape' format looks a lot like plain old text, confusion ensues.
Now, guessing between two input formats that are both legal for the
data type is not exactly the same as guessing what to do with a value
that's not directly representable, but it has the same ultimate effect,
i.e., human beings perceive the system as buggy.
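
To make the guessing concrete, here is a rough Python sketch of the
decision described above. It is a simplified model written for
illustration, not the actual byteain() code: the function name
bytea_in is made up, and it assumes UTF-8 for ordinary characters.

```python
def bytea_in(text: str) -> bytes:
    """Simplified, hypothetical model of bytea input parsing."""
    if text.startswith("\\x"):
        # 'hex' format: everything after the \x prefix is hex digit pairs.
        return bytes.fromhex(text[2:])

    # Otherwise assume 'escape' format, which looks a lot like plain text.
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\":
            # Ordinary character: taken as-is (UTF-8 is an assumption here).
            out += ch.encode("utf-8")
            i += 1
        elif text[i + 1:i + 2] == "\\":
            # A doubled backslash stands for one literal backslash byte.
            out.append(0x5C)
            i += 2
        elif (len(text) >= i + 4
              and text[i + 1] in "0123"
              and text[i + 2] in "01234567"
              and text[i + 3] in "01234567"):
            # Backslash followed by three octal digits: one byte value.
            out.append(int(text[i + 1:i + 4], 8))
            i += 4
        else:
            # A lone or malformed backslash is rejected outright.
            raise ValueError("invalid input syntax for type bytea")
    return bytes(out)


print(bytea_in("x"))                   # plain text happens to "work"
print(bytea_in("") == bytea_in("\\x"))  # empty escape input equals empty hex input
try:
    bytea_in("\\")                     # a single backslash fails
except ValueError as err:
    print("error:", err)
```

Under this model, 'x' goes through the escape path and comes back
unchanged, which is what makes the cast look like convert_to() right
up until the data contains a backslash.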

A case that is perhaps more theoretically similar to the instance at
hand is rounding during the construction of floating point values. My
system thinks '1.00000000000000000000000001'::float = '1'::float, so
in that case, as in this one, we've decided that it's OK to build an
inexact representation of the input value. I don't really see what can
be done about this considering that the textual representation uses
base 10 and the internal representation uses base 2, but I think this
doesn't cause us as many problems in practice because people
understand how it works, which doesn't seem to be the case with the
interval data type, at least if this thread is any indication.
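
The same rounding can be seen outside the database. A small Python
sketch, assuming that Python's float and the float type on my system
are both IEEE 754 double precision:

```python
import sys
from decimal import Decimal

a = float("1.00000000000000000000000001")
b = float("1")

# The extra 1e-26 is far below the spacing between adjacent doubles
# near 1.0, so the decimal input rounds to exactly 1.0.
print(a == b)                                  # True
print(Decimal(a))                              # 1
print(sys.float_info.epsilon > Decimal("1e-26"))  # True
```

Decimal(a) shows the exact value that was stored, and
sys.float_info.epsilon is the gap between 1.0 and the next larger
double, so anything that much closer to 1.0 cannot be told apart
from it.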

I am dubious that it's worth the pain of making the input function
reject cases involving fractional units. It's true that some people
here aren't happy with the current behavior, but they may be no happier
if we reject those cases with an error, and other people may then be
unhappy too. I think your previous idea was the best one so far: fix
the input function so that 'X years Y months' and 'Y months X years'
always produce the same answer, and call it good.

--
Robert Haas
EDB: http://www.enterprisedb.com
