Re: Have I found an interval arithmetic bug?

From: Bryn Llewellyn <bryn(at)yugabyte(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Isaac Morland <isaac(dot)morland(at)gmail(dot)com>, John W Higgins <wishdev(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane PostgreSQL <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Have I found an interval arithmetic bug?
Date: 2021-04-08 18:17:18
Message-ID: 4EE3B8DC-2F84-4C9B-9A99-4650562A4F84@yugabyte.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

> On 08-Apr-2021, at 10:24, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>
> On Mon, Apr 5, 2021 at 02:01:58PM -0400, Bruce Momjian wrote:
>> On Mon, Apr 5, 2021 at 11:33:10AM -0500, Justin Pryzby wrote:
>> Well, bug or not, we are not going to change back branches for this, and
>> if you want a larger discussion, it will have to wait for PG 15.
>>
>>>> https://www.google.com/url?q=https://www.postgresql.org/docs/current/datatype-datetime.html%23DATATYPE-INTERVAL-INPUT&source=gmail-imap&ust=1618507489000000&usg=AOvVaw2h2TNbK7O41zsDn8HfD88C
>>>> « …field values can have fractional parts; for example '1.5 week' or '01:02:03.45'. Such input is converted to the appropriate number of months, days, and seconds for storage. When this would result in a fractional number of months or days, the fraction is added to the lower-order fields using the conversion factors 1 month = 30 days and 1 day = 24 hours. For example, '1.5 month' becomes 1 month and 15 days. Only seconds will ever be shown as fractional on output. »
>>
>> I see that. What is not clear here is how far we flow down. I was
>> looking at adding documentation or regression tests for that, but was
>> unsure. I adjusted the docs slightly in the attached patch.
>
> Here is an updated patch, which will be for PG 15. It updates the
> documentation to state:
>
> The fractional parts are used to compute appropriate values for the next
> lower-order internal fields (months, days, seconds).
>
> It removes the flow from fractional months/weeks to
> hours-minutes-seconds, and adds missing rounding for fractional
> computations.

Thank you Bruce. I look forward to documenting this new algorithm for YugabyteDB. The algorithm implements the transformation from this:

[
yy_in numeric,
mo_in numeric,
dd_in numeric,
hh_in numeric,
mi_in numeric,
ss_in numeric
]

to this:

[
mo_internal_representation int,
dd_internal_representation int,
ss_internal_representation numeric(1000,6)
]

I am convinced that a prose account of the algorithm, by itself, is not the best way to tell the reader the rules that the algorithm implements. Rather, psuedocode is needed. I mentioned before that, better still, is actual executable PL/pgSQL code. (I can expect readers to be fluent in PL/pgSQL.) Given this executable simulation, an informal prose sketch of what it does will definitely add value.

May I ask you to fill in the body of this stub by translating the C that you have in hand?

create type internal_representation_t as(
mo_internal_representation int,
dd_internal_representation int,
ss_internal_representation numeric(1000,6));

create function internal_representation(
yy_in numeric default 0,
mo_in numeric default 0,
dd_in numeric default 0,
hh_in numeric default 0,
mi_in numeric default 0,
ss_in numeric default 0)
returns internal_representation_t
language plpgsql
as $body$
declare
mo_internal_representation int not null := 0;
dd_internal_representation int not null := 0;
ss_internal_representation numeric not null := 0.0;

ok constant boolean :=
(yy_in is not null) and
(mo_in is not null) and
(dd_in is not null) and
(hh_in is not null) and
(mi_in is not null) and
(ss_in is not null);
begin
assert ok, 'No actual argument, when provided, may be null';

-- The algorithm.

return (mo_internal_representation, dd_internal_representation, ss_internal_representation)::internal_representation_t;
end;
$body$;

By the way, I believe that a user might well decide always to supply all the fields in a "from text to interval" typecast, except for the seconds, as integral values. This, after all, is what the "make_interval()" function enforces. But, because the typecast approach allows non-integral values, the reference documentation must explain the rules unambiguously so that the reader can predict the outcome of any ad hoc test that they might try.

It's a huge pity that the three values of the internal representation cannot be observed directly using SQL because each behaves with different semantics when an interval value is added to a timestamptz value. However, as a second best (and knowing the algorithm above), a user can create interval values where only one of the three fields is populated and test their understanding of the semantic rules that way.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Kevin Brannen 2021-04-08 20:32:19 RE: Check constraint failure messages
Previous Message Bruce Momjian 2021-04-08 17:24:44 Re: Have I found an interval arithmetic bug?

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2021-04-08 18:35:51 Re: pgsql: autovacuum: handle analyze for partitioned tables
Previous Message Simon Riggs 2021-04-08 18:10:09 Re: VACUUM (DISABLE_PAGE_SKIPPING on)