Re: [HACKERS] Interval aggregate regression failure (expected seems

From: Michael Glaesemann <grzm(at)seespotcode(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paesold <mpaesold(at)gmx(dot)at>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [HACKERS] Interval aggregate regression failure (expected seems
Date: 2006-08-28 07:24:20
Message-ID: 16F3B056-8929-4F68-AB72-929480B1F576@seespotcode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches


On Aug 26, 2006, at 11:40 , Bruce Momjian wrote:

>
> I used your ideas to make a patch to fix your example:
>
> test=> select '41 months'::interval / 10;
> ?column?
> ---------------
> 4 mons 3 days
> (1 row)
>
> and
>
> test=> select '41 months'::interval * 0.3;
> ?column?
> ---------------
> 1 year 9 days
> (1 row)
>
> The trick was not to play with the division, but to check if the
> number
> of seconds cascaded into full days and/or months.

While this does provide a fix for the example, I don't believe it's a
complete solution. For example, with your patch, you also get the
following results:

select '41 mon 360:00'::interval / 10 as "pos"
, '-41 mon -360:00'::interval / 10 as "neg";
pos | neg
------------------------+------------------------------
4 mons 2 days 60:00:00 | -4 mons -2 days -59:59:60.00
(1 row)

If I've done the math right, this should be:
4 mons 3 days 36:00:00 | -4 mons -3 days -36:00:00

select '41 mon -360:00'::interval / 10 as "pos"
, '-41 mon 360:00'::interval / 10 as "neg";
pos | neg
-------------------------+---------------------------
4 mons 2 days -12:00:00 | -4 mons -2 days +12:00:00
(1 row)

Should be:
4 mons 3 days -36:00:00 | -4 mons -3 days +36:00:00

What we want to do is check just the month contribution to the day
component to see if it is greater than 24 hours. Perhaps the simplest
way to accomplish this is something like (psuedo code):

if (abs(tsround(month_remainder * SECS_PER_DAY)) == SECS_PER_DAY)
{
if (month_remainder > 0)
{
result->month++;
}
else
{
result->month--;
}
}

I'm going to try something along these lines this evening.

FWIW, I've included the patch of for what I'm working on. It's pretty
heavily commented right now with expected results as I think through
what the code's doing. (It also includes the DecodeInterval patch I
sent to -patches earlier today.) I'm still getting overflow warnings
in the lines including USECS_PER_DAY and USECS_PER_MONTH, and my
inexperience with C and gdb is getting the best of me right now
(though I'm still plugging away: ).

Michael Glaesemann
grzm seespotcode net

----8<-------------------
? CONFIGURE_ARGS
? datetime.patch
? timestamp.patch
? src/backend/.DS_Store
? src/include/.DS_Store
? src/test/.DS_Store
Index: src/backend/utils/adt/datetime.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/datetime.c,v
retrieving revision 1.169
diff -c -r1.169 datetime.c
*** src/backend/utils/adt/datetime.c 25 Jul 2006 03:51:21 -0000 1.169
--- src/backend/utils/adt/datetime.c 28 Aug 2006 07:08:46 -0000
***************
*** 2920,2935 ****
tm->tm_mday += val * 7;
if (fval != 0)
{
! int sec;
!
! fval *= 7 * SECS_PER_DAY;
! sec = fval;
! tm->tm_sec += sec;
#ifdef HAVE_INT64_TIMESTAMP
! *fsec += (fval - sec) * 1000000;
#else
! *fsec += fval - sec;
#endif
}
tmask = (fmask & DTK_M(DAY)) ? 0 : DTK_M(DAY);
break;
--- 2920,2942 ----
tm->tm_mday += val * 7;
if (fval != 0)
{
! int extra_days;
! fval *= 7;
! extra_days = (int32) fval;
! tm->tm_mday += extra_days;
! fval -= extra_days;
! if (fval != 0)
! {
! int sec;
! fval *= SECS_PER_DAY;
! sec = fval;
! tm->tm_sec += sec;
#ifdef HAVE_INT64_TIMESTAMP
! *fsec += (fval - sec) * 1000000;
#else
! *fsec += fval - sec;
#endif
+ }
}
tmask = (fmask & DTK_M(DAY)) ? 0 : DTK_M(DAY);
break;
***************
*** 2938,2953 ****
tm->tm_mon += val;
if (fval != 0)
{
! int sec;
!
! fval *= DAYS_PER_MONTH * SECS_PER_DAY;
! sec = fval;
! tm->tm_sec += sec;
#ifdef HAVE_INT64_TIMESTAMP
! *fsec += (fval - sec) * 1000000;
#else
! *fsec += fval - sec;
#endif
}
tmask = DTK_M(MONTH);
break;
--- 2945,2967 ----
tm->tm_mon += val;
if (fval != 0)
{
! int day;
! fval *= DAYS_PER_MONTH;
! day = fval;
! tm->tm_mday += day;
! fval -= day;
! if (fval != 0)
! {
! int sec;
! fval *= SECS_PER_DAY;
! sec = fval;
! tm->tm_sec += sec;
#ifdef HAVE_INT64_TIMESTAMP
! *fsec += (fval - sec) * 1000000;
#else
! *fsec += fval - sec;
#endif
+ }
}
tmask = DTK_M(MONTH);
break;
Index: src/backend/utils/adt/timestamp.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/timestamp.c,v
retrieving revision 1.165
diff -c -r1.165 timestamp.c
*** src/backend/utils/adt/timestamp.c 13 Jul 2006 16:49:16 -0000 1.165
--- src/backend/utils/adt/timestamp.c 28 Aug 2006 07:08:48 -0000
***************
*** 2547,2556 ****
Interval *span = PG_GETARG_INTERVAL_P(0);
float8 factor = PG_GETARG_FLOAT8(1);
double month_remainder,
day_remainder,
! month_remainder_days;
Interval *result;

result = (Interval *) palloc(sizeof(Interval));

if (factor == 0.0)
--- 2547,2572 ----
Interval *span = PG_GETARG_INTERVAL_P(0);
float8 factor = PG_GETARG_FLOAT8(1);
double month_remainder,
+ month_remainder_usecs = 0.,
day_remainder,
! day_remainder_usecs = 0.;
! int32 extra_days;
Interval *result;

+ /*
+ example a:
+ select '1.5 mon'::interval / 10;
+ span = { time = 0, day = 15, month = 1 }
+ factor = 10.0
+ result: 4 days 12:00 { time = 43200.0 (int64 43_200_000_000),
day = 4, month = 0 }
+
+ example b:
+ select '41 mon'::interval / 10;
+ span = { time = 0, day = 0, month = 41 }
+ factor = 10.0
+ result: 4 mon 3 days { time = 0, day = 3, month = 4 }
+ */
+
result = (Interval *) palloc(sizeof(Interval));

if (factor == 0.0)
***************
*** 2559,2584 ****
errmsg("division by zero")));

month_remainder = span->month / factor;
! day_remainder = span->day / factor;
result->month = (int32) month_remainder;
! result->day = (int32) day_remainder;
! month_remainder -= result->month;
! day_remainder -= result->day;

/*
! * Handle any fractional parts the same way as in interval_mul.
*/

/* fractional months full days into days */
! month_remainder_days = month_remainder * DAYS_PER_MONTH;
! result->day += (int32) month_remainder_days;
! /* fractional months partial days into time */
! day_remainder += month_remainder_days - (int32) month_remainder_days;

#ifdef HAVE_INT64_TIMESTAMP
! result->time = rint(span->time / factor + day_remainder *
USECS_PER_DAY);
#else
! result->time = span->time / factor + day_remainder * SECS_PER_DAY;
#endif

PG_RETURN_INTERVAL_P(result);
--- 2575,2682 ----
errmsg("division by zero")));

month_remainder = span->month / factor;
! /*
! a: month_remainder = 1 / 10.0 = 0.1
! b: month_remainder = 41 / 10.0 = 4.1
! */
result->month = (int32) month_remainder;
! /*
! a: result->month = 0
! b: result->month = 4
! */
! /* FIXME integer overflow */
! month_remainder_usecs = rint( (double) (month_remainder - result-
>month)
! * USECS_PER_MONTH);
! /*
! a: month_remainder_usecs = rint((0.1 - 0) * ((double)
DAYS_PER_MONTH * (double) USECS_PER_DAY))
! = rint(0.1 * (30.0 * (SECS_PER_DAY * 1000000)))
! = rint(0.1 * 30.0 * 86400 * 1000000)
! = 259_200_000_000
! b: month_remainder_usecs = rint((4.1 - 4) * ((double)
DAYS_PER_MONTH * (double) USECS_PER_DAY))
! = rint( 0.1 * 30.0 * 86400 * 1000000)
! = 259_200_000_000
! */
!
! /*
! * Due to the inherent inaccuracy of floating point division,
round to
! * microsecond accuracy. Scale up intermediate results to
microseconds
! * and scale back down for the final result.
! */

+ day_remainder = span->day / factor;
/*
! a: day_remainder = 15 / 10.0 = 1.5
! b: day_remainder = 0 / 10.0 = 0.0
! */
! result->day = (int32) day_remainder;
! /*
! a: result->day = (int32) 1.5 = 1
! b: result->day = (int32) 0.0 = 0
! */
! /* FIXME integer overflow */
! day_remainder_usecs = rint( (double) (day_remainder - result->day)
* (double) USECS_PER_DAY);
! /*
! a: day_remainder_usecs = rint((1.5 - 1) * (SECS_PER_DAY * 1000000))
! = rint( 0.5 * 86400 * 1_000_000 )
! = 43_200_000_000
! b: day_remainder_usecs = rint((0.0 - 0 ) * 86400 * 1_000_000)
! = 0
! */
!
! /*
! * Handle any fractional parts the same way as in interval_mul.
! * Fractional months and fractional days are added to the result
separately
! * to prevent their time fractional components from contributing
to the day
! * component.
*/

/* fractional months full days into days */
! /* FIXME integer overflow */
! extra_days = (int32) (month_remainder_usecs / (double)
USECS_PER_DAY);
! /*
! a: extra_days = (int32) (259_200_000_000 / (SECS_PER_DAY * 1000000))
! = (int32) (259_200_000_000 / (86400 * 1000000))
! = (int32) 3.0
! = 3
! b: extra_days = (int32) (259_200_000_000 / (86400 * 1000000) )
! = (int32) 3.0
! = 3
! */
! result->day += extra_days;
! /*
! a: result->day = 1 + 3 = 4
! b: result->day = 0 + 3 = 3
! */
! /* FIXME integer overflow */
! month_remainder_usecs -= extra_days * (double) USECS_PER_DAY;
! /*
! a: month_remainder_usecs = 259_200_000_000 - (3 * (86400 * 1000000))
! = 259_200_000_000 - (3 * (86400 * 1000000))
! = 259_200_000_000 - 259_200_000_000
! = 0
! b: month_remainder_usecs = 259_200_000_000 - (3 * (86400 * 1000000))
! = 259_200_000_000 - 259_200_000_000
! = 0
! */

#ifdef HAVE_INT64_TIMESTAMP
! result->time = rint(span->time / factor +
! day_remainder_usecs + month_remainder_usecs);
! /*
! a: result->time = rint(0.0 / 10.0 + 43_200_000_000 + 0.0)
! = 43_200_000_000
! b: result->time = rint(0.0 / 10.0 + 0 + 0.0)
! = 0
! */
#else
! result->time = rint(span->time / factor * 1.0e6 * +
! day_remainder_usecs + month_remainder_usecs) / 1.0e6;
! /*
! a: result->time = rint(0.0 / 10.0 * 1.0e6 + 43_200_000_000 +
0.0) / 1.0e6
! = 43200.0
! b: result->time = rint(0.0 / 10.0 * 1.0e6 + 0.0 + 0.0) / 1.0e6
! = 0.0
! */
#endif

PG_RETURN_INTERVAL_P(result);
Index: src/include/utils/timestamp.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/utils/timestamp.h,v
retrieving revision 1.62
diff -c -r1.62 timestamp.h
*** src/include/utils/timestamp.h 13 Jul 2006 16:49:20 -0000 1.62
--- src/include/utils/timestamp.h 28 Aug 2006 07:08:51 -0000
***************
*** 79,92 ****
#define SECS_PER_YEAR (36525 * 864) /* avoid floating-point
computation */
#define SECS_PER_DAY 86400
#define SECS_PER_HOUR 3600
! #define SECS_PER_MINUTE 60
#define MINS_PER_HOUR 60

#ifdef HAVE_INT64_TIMESTAMP
#define USECS_PER_DAY INT64CONST(86400000000)
#define USECS_PER_HOUR INT64CONST(3600000000)
! #define USECS_PER_MINUTE INT64CONST(60000000)
#define USECS_PER_SEC INT64CONST(1000000)
#endif

/*
--- 79,96 ----
#define SECS_PER_YEAR (36525 * 864) /* avoid floating-point
computation */
#define SECS_PER_DAY 86400
#define SECS_PER_HOUR 3600
! #define SECS_PER_MINUTE 60
#define MINS_PER_HOUR 60

#ifdef HAVE_INT64_TIMESTAMP
+ #define USECS_PER_MONTH INT64CONST(2592000000000)
#define USECS_PER_DAY INT64CONST(86400000000)
#define USECS_PER_HOUR INT64CONST(3600000000)
! #define USECS_PER_MINUTE INT64CONST(60000000)
#define USECS_PER_SEC INT64CONST(1000000)
+ #else
+ #define USECS_PER_DAY (SECS_PER_DAY * 1000000)
+ #define USECS_PER_MONTH ((double) DAYS_PER_MONTH * (double)
USECS_PER_DAY)
#endif

/*

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Böszörményi Zoltán 2006-08-28 08:58:43 Re: [HACKERS] Performance testing of COPY (SELECT) TO
Previous Message Pavel Stehule 2006-08-28 05:04:27 plpgsql, return can contains any expression

Browse pgsql-patches by date

  From Date Subject
Next Message Böszörményi Zoltán 2006-08-28 08:58:43 Re: [HACKERS] Performance testing of COPY (SELECT) TO
Previous Message Pavel Stehule 2006-08-28 05:04:27 plpgsql, return can contains any expression