Re: ISO 8601 "Time Intervals" of the "format with time-unit

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Ron Mayer <ron(at)intervideo(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: ISO 8601 "Time Intervals" of the "format with time-unit
Date: 2003-09-08 20:29:56
Message-ID: 200309082029.h88KTu126612@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-advocacy pgsql-general pgsql-hackers pgsql-patches


This has been saved for the 7.5 release:

http:/momjian.postgresql.org/cgi-bin/pgpatches2

Feel free to submit an updated patch that rips out the old syntax, as
discussed, or replace this patch with a more comprehensive one.

---------------------------------------------------------------------------

Ron Mayer wrote:
> Short summary:
>
> This patch allows ISO 8601 "time intervals" using the "format
> with time-unit designators" to specify postgresql "intervals".
>
> Below I have (A) What these time intervals are, (B) What I
> modified to support them, (C) Issues with intervals I want
> to bring up, and (D) a patch supporting them.
>
> It's helpful to me. Any feedback is appreciated. If you
> did want to consider including it, let me know what to clean
> up. If not, I thought I'd just put it here if anyone else finds
> it useful too.
>
> Thanks for your time,
>
> Ron Mayer
>
> Longer:
>
> (A) What these intervals are.
>
> ISO 8601, the standard from which PostgreSQL gets some of it's
> time syntax, also has a specification for "time-intervals".
>
> In particular, section 5.5.4.2 has a "Representation of
> time-interval by duration only" which I believe maps
> nicely to ISO intervals.
>
> Compared to the ISO 8601 time interval specification, the
> postgresql interval syntax is quite verbose. For example:
>
> Postgresql interval: ISO8601 Interval
> ---------------------------------------------------
> '1 year 6 months' 'P1Y6M'
> '3 hours 25 minutes 42 seconds' 'PT3H25M42S'
>
> Yeah, it's uglier, but it sure is short which can make
> for quicker typing and shorter scripts, and if for some
> strange reason you had an application using this format
> it's nice not to have to translate.
>
> The syntax is as follows:
> Basic extended format: PnYnMnDTnHnMnS
> PnW
>
> Where everything before the "T" is a date-part and everything
> after is a time-part. W is for weeks.
> In the date-part, Y=Year, M=Month, D=Day
> In the time-part, H=Hour, M=Minute, S=Second
>
> Much more info can be found from the draft standard
> ftp://ftp.qsl.net/pub/g1smd/154N362_.PDF
> The final standard's only available for $$$ so I didn't
> look at it. Some other sites imply that this part didn't
> change from the last draft to the standard.
>
>
> (B) This change was made by adding two functions to "datetime.c"
> next to where DecodeInterval parses the normal interval syntax.
>
> A total of 313 lines were added, including comments and sgml docs.
> Of these only 136 are actual code, the rest, comments, whitespace, etc.
>
>
> One new function "DecodeISO8601Interval" follows the style of
> "DecodeInterval" below it, and trys to strictly follow the ISO
> syntax. If it doesn't match, it'll return -1 and the old syntax
> will be checked as before.
>
> The first test (first character of the first field must be 'P',
> and second character must be 'T' or '\0') should be fast so I don't
> think this will impact performance of existing code.
>
>
> The second function ("adjust_fval") is just a small helper-function
> to remove some of the cut&paste style that DecodeInterval used.
>
> It seems to work.
> =======================================================================
> betadb=# select 'P1M15DT12H30M7S'::interval;
> interval
> ------------------------
> 1 mon 15 days 12:30:07
> (1 row)
>
> betadb=# select '1 month 15 days 12 hours 30 minutes 7 seconds'::interval;
> interval
> ------------------------
> 1 mon 15 days 12:30:07
> (1 row)
> =====================================================================
>
>
>
> (C) Open issues with intervals, and questions I'd like to ask.
>
> 1. DecodeInterval seems to have a hardcoded '.' for specifying
> fractional times. ISO 8601 states that both '.' and ',' are
> ok, but "of these, the comma is the preferred sign".
>
> In DecodeISO8601Interval I loosened the test to allow
> both but left it as it was in DecodeInterval. Should
> both be changed to make them more consistant?
>
> 2. In "DecodeInterval", fractional weeks and fractional months
> can produce seconds; but fractional years can not (rounded
> to months). I didn't understand the reasoning for this, so
> I left it the same, and followed the same convention for
> ISO intervals. Should I change this?
>
> 3. I could save a bunch of copy-paste-lines-of-code from the
> pre-existing DecodeInterval by calling the adjust_fval helper
> function. The tradeoff is a few extra function-calls when
> decoding an interval. However I didn't want to risk changes
> to the existing part unless you guys encourage me to do so.
>
>
> (D) The patch.
>
>
> Index: doc/src/sgml/datatype.sgml
> ===================================================================
> RCS file: /projects/cvsroot/pgsql-server/doc/src/sgml/datatype.sgml,v
> retrieving revision 1.123
> diff -u -1 -0 -r1.123 datatype.sgml
> --- doc/src/sgml/datatype.sgml 31 Aug 2003 17:32:18 -0000 1.123
> +++ doc/src/sgml/datatype.sgml 8 Sep 2003 04:04:58 -0000
> @@ -1735,20 +1735,71 @@
> Quantities of days, hours, minutes, and seconds can be specified without
> explicit unit markings. For example, <literal>'1 12:59:10'</> is read
> the same as <literal>'1 day 12 hours 59 min 10 sec'</>.
> </para>
>
> <para>
> The optional precision
> <replaceable>p</replaceable> should be between 0 and 6, and
> defaults to the precision of the input literal.
> </para>
> +
> +
> + <para>
> + Alternatively, <type>interval</type> values can be written as
> + ISO 8601 time intervals, using the "Format with time-unit designators".
> + This format always starts with the character <literal>'P'</>, followed
> + by a string of values followed by single character time-unit designators.
> + A <literal>'T'</> separates the date and time parts of the interval.
> + </para>
> +
> + <para>
> + Format: PnYnMnDTnHnMnS
> + </para>
> + <para>
> + In this format, <literal>'n'</> gets replaced by a number, and
> + <literal>Y</> represents years,
> + <literal>M</> (in the date part) months,
> + <literal>D</> months,
> + <literal>H</> hours,
> + <literal>M</> (in the time part) minutes,
> + and <literal>S</> seconds.
> + </para>
> +
> +
> + <table id="interval-example-table">
> + <title>Interval Example</title>
> + <tgroup cols="2">
> + <thead>
> + <row>
> + <entry>Traditional</entry>
> + <entry>ISO-8601 time-interval</entry>
> + </row>
> + </thead>
> + <tbody>
> + <row>
> + <entry>1 month</entry>
> + <entry>P1M</entry>
> + </row>
> + <row>
> + <entry>1 hour 30 minutes</entry>
> + <entry>PT1H30M</entry>
> + </row>
> + <row>
> + <entry>2 years 10 months 15 days 10 hours 30 minutes 20 seconds</entry>
> + <entry>P2Y10M15DT10H30M20S</entry>
> + </row>
> + </tbody>
> + </thead>
> + </table>
> +
> + </para>
> </sect3>
>
> <sect3>
> <title>Special Values</title>
>
> <indexterm>
> <primary>time</primary>
> <secondary>constants</secondary>
> </indexterm>
>
> Index: src/backend/utils/adt/datetime.c
> ===================================================================
> RCS file: /projects/cvsroot/pgsql-server/src/backend/utils/adt/datetime.c,v
> retrieving revision 1.116
> diff -u -1 -0 -r1.116 datetime.c
> --- src/backend/utils/adt/datetime.c 27 Aug 2003 23:29:28 -0000 1.116
> +++ src/backend/utils/adt/datetime.c 8 Sep 2003 04:04:59 -0000
> @@ -30,20 +30,21 @@
> struct tm * tm, fsec_t *fsec, int *is2digits);
> static int DecodeNumberField(int len, char *str,
> int fmask, int *tmask,
> struct tm * tm, fsec_t *fsec, int *is2digits);
> static int DecodeTime(char *str, int fmask, int *tmask,
> struct tm * tm, fsec_t *fsec);
> static int DecodeTimezone(char *str, int *tzp);
> static datetkn *datebsearch(char *key, datetkn *base, unsigned int nel);
> static int DecodeDate(char *str, int fmask, int *tmask, struct tm * tm);
> static void TrimTrailingZeros(char *str);
> +static int DecodeISO8601Interval(char **field, int *ftype, int nf, int *dtype, struct tm * tm, fsec_t *fsec);
>
>
> int day_tab[2][13] = {
> {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31, 0},
> {31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31, 0}};
>
> char *months[] = {"Jan", "Feb", "Mar", "Apr", "May", "Jun",
> "Jul", "Aug", "Sep", "Oct", "Nov", "Dec", NULL};
>
> char *days[] = {"Sunday", "Monday", "Tuesday", "Wednesday",
> @@ -2872,30 +2873,271 @@
> default:
> *val = tp->value;
> break;
> }
> }
>
> return type;
> }
>
>
> +void adjust_fval(double fval,struct tm * tm, fsec_t *fsec, int scale);
> +{
> + int sec;
> + fval *= scale;
> + sec = fval;
> + tm->tm_sec += sec;
> +#ifdef HAVE_INT64_TIMESTAMP
> + *fsec += ((fval - sec) * 1000000);
> +#else
> + *fsec += (fval - sec);
> +#endif
> +}
> +
> +
> +/* DecodeISO8601Interval()
> + *
> + * Check if it's a ISO 8601 Section 5.5.4.2 "Representation of
> + * time-interval by duration only."
> + * Basic extended format: PnYnMnDTnHnMnS
> + * PnW
> + * For more info.
> + * http://www.astroclark.freeserve.co.uk/iso8601/index.html
> + * ftp://ftp.qsl.net/pub/g1smd/154N362_.PDF
> + *
> + * Examples: P1D for 1 day
> + * PT1H for 1 hour
> + * P2Y6M7DT1H30M for 2 years, 6 months, 7 days 1 hour 30 min
> + *
> + * The first field is exactly "p" or "pt" it may be of this type.
> + *
> + * Returns -1 if the field is not of this type.
> + *
> + * It pretty strictly checks the spec, with the two exceptions
> + * that a week field ('W') may coexist with other units, and that
> + * this function allows decimals in fields other than the least
> + * significant units.
> + */
> +int
> +DecodeISO8601Interval(char **field, int *ftype, int nf, int *dtype, struct tm * tm, fsec_t *fsec)
> +{
> + char *cp;
> + int fmask = 0,
> + tmask;
> + int val;
> + double fval;
> + int arg;
> + int datepart;
> +
> + /*
> + * An ISO 8601 "time-interval by duration only" must start
> + * with a 'P'. If it contains a date-part, 'p' will be the
> + * only character in the field. If it contains no date part
> + * it will contain exactly to characters 'PT' indicating a
> + * time part.
> + * Anything else is illegal and will be treated like a
> + * traditional postgresql interval.
> + */
> + if (!(field[0][0] == 'p' &&
> + ((field[0][1] == 0) || (field[0][1] == 't' && field[0][2] == 0))))
> + {
> + return -1;
> + }
> +
> +
> + /*
> + * If the first field is exactly 1 character ('P'), it starts
> + * with date elements. Otherwise it's two characters ('PT');
> + * indicating it starts with a time part.
> + */
> + datepart = (field[0][1] == 0);
> +
> + /*
> + * Every value must have a unit, so we require an even
> + * number of value/unit pairs. Therefore we require an
> + * odd nubmer of fields, including the prefix 'P'.
> + */
> + if ((nf & 1) == 0)
> + return -1;
> +
> + /*
> + * Process pairs of fields at a time.
> + */
> + for (arg = 1 ; arg < nf ; arg+=2)
> + {
> + char * value = field[arg ];
> + char * units = field[arg+1];
> +
> + /*
> + * The value part must be a number.
> + */
> + if (ftype[arg] != DTK_NUMBER)
> + return -1;
> +
> + /*
> + * extract the number, almost exactly like the non-ISO interval.
> + */
> + val = strtol(value, &cp, 10);
> +
> + /*
> + * One difference from the normal postgresql interval below...
> + * ISO 8601 states that "Of these, the comma is the preferred
> + * sign" so I allow it here for locales that support it.
> + * Note: Perhaps the old-style interval code below should
> + * allow for this too, but I didn't want to risk backward
> + * compatability.
> + */
> + if (*cp == '.' || *cp == ',')
> + {
> + fval = strtod(cp, &cp);
> + if (*cp != '\0')
> + return -1;
> +
> + if (val < 0)
> + fval = -(fval);
> + }
> + else if (*cp == '\0')
> + fval = 0;
> + else
> + return -1;
> +
> +
> + if (datepart)
> + {
> + /*
> + * All the 8601 unit specifiers are 1 character, but may
> + * be followed by a 'T' character if transitioning between
> + * the date part and the time part. If it's not either
> + * one character or two characters with the second being 't'
> + * it's an error.
> + */
> + if (!(units[1] == 0 || (units[1] == 't' && units[2] == 0)))
> + return -1;
> +
> + if (units[1] == 't')
> + datepart = 0;
> +
> + switch (units[0]) /* Y M D W */
> + {
> + case 'd':
> + tm->tm_mday += val;
> + if (fval != 0)
> + adjust_fval(fval,tm,fsec, 86400);
> + tmask = ((fmask & DTK_M(DAY)) ? 0 : DTK_M(DAY));
> + break;
> +
> + case 'w':
> + tm->tm_mday += val * 7;
> + if (fval != 0)
> + adjust_fval(fval,tm,fsec,7 * 86400);
> + tmask = ((fmask & DTK_M(DAY)) ? 0 : DTK_M(DAY));
> + break;
> +
> + case 'm':
> + tm->tm_mon += val;
> + if (fval != 0)
> + adjust_fval(fval,tm,fsec,30 * 86400);
> + tmask = DTK_M(MONTH);
> + break;
> +
> + case 'y':
> + /*
> + * Why can fractional months produce seconds,
> + * but fractional years can't? Well the older
> + * interval code below has the same property
> + * so this one follows the other one too.
> + */
> + tm->tm_year += val;
> + if (fval != 0)
> + tm->tm_mon += (fval * 12);
> + tmask = ((fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR));
> + break;
> +
> + default:
> + return -1; /* invald date unit prefix */
> + }
> + }
> + else
> + {
> + /*
> + * ISO 8601 time part.
> + * In the time part, only one-character
> + * unit prefixes are allowed. If it's more
> + * than one character, it's not a valid ISO 8601
> + * time interval by duration.
> + */
> + if (units[1] != 0)
> + return -1;
> +
> + switch (units[0]) /* H M S */
> + {
> + case 's':
> + tm->tm_sec += val;
> +#ifdef HAVE_INT64_TIMESTAMP
> + *fsec += (fval * 1000000);
> +#else
> + *fsec += fval;
> +#endif
> + tmask = DTK_M(SECOND);
> + break;
> +
> + case 'm':
> + tm->tm_min += val;
> + if (fval != 0)
> + adjust_fval(fval,tm,fsec,60);
> + tmask = DTK_M(MINUTE);
> + break;
> +
> + case 'h':
> + tm->tm_hour += val;
> + if (fval != 0)
> + adjust_fval(fval,tm,fsec,3600);
> + tmask = DTK_M(HOUR);
> + break;
> +
> + default:
> + return -1; /* invald time unit prefix */
> + }
> + }
> + fmask |= tmask;
> + }
> +
> + if (*fsec != 0)
> + {
> + int sec;
> +
> +#ifdef HAVE_INT64_TIMESTAMP
> + sec = (*fsec / INT64CONST(1000000));
> + *fsec -= (sec * INT64CONST(1000000));
> +#else
> + TMODULO(*fsec, sec, 1e0);
> +#endif
> + tm->tm_sec += sec;
> + }
> + return (fmask != 0) ? 0 : -1;
> +}
> +
> +
> /* DecodeInterval()
> * Interpret previously parsed fields for general time interval.
> * Returns 0 if successful, DTERR code if bogus input detected.
> *
> * Allow "date" field DTK_DATE since this could be just
> * an unsigned floating point number. - thomas 1997-11-16
> *
> * Allow ISO-style time span, with implicit units on number of days
> * preceding an hh:mm:ss field. - thomas 1998-04-30
> + *
> + * Allow ISO-8601 style "Representation of time-interval by duration only"
> + * of the format 'PnYnMnDTnHnMnS' and 'PnW' - ron 2003-08-30
> */
> +
> int
> DecodeInterval(char **field, int *ftype, int nf, int *dtype, struct tm * tm, fsec_t *fsec)
> {
> int is_before = FALSE;
> char *cp;
> int fmask = 0,
> tmask,
> type;
> int i;
> int dterr;
> @@ -2906,20 +3148,37 @@
>
> type = IGNORE_DTF;
> tm->tm_year = 0;
> tm->tm_mon = 0;
> tm->tm_mday = 0;
> tm->tm_hour = 0;
> tm->tm_min = 0;
> tm->tm_sec = 0;
> *fsec = 0;
>
> + /*
> + * Check if it's a ISO 8601 Section 5.5.4.2 "Representation of
> + * time-interval by duration only."
> + * Basic extended format: PnYnMnDTnHnMnS
> + * PnW
> + * http://www.astroclark.freeserve.co.uk/iso8601/index.html
> + * ftp://ftp.qsl.net/pub/g1smd/154N362_.PDF
> + * Examples: P1D for 1 day
> + * PT1H for 1 hour
> + * P2Y6M7DT1H30M for 2 years, 6 months, 7 days 1 hour 30 min
> + *
> + * The first field is exactly "p" or "pt" it may be of this type.
> + */
> + if (DecodeISO8601Interval(field,ftype,nf,dtype,tm,fsec) == 0) {
> + return 0;
> + }
> +
> /* read through list backwards to pick up units before values */
> for (i = nf - 1; i >= 0; i--)
> {
> switch (ftype[i])
> {
> case DTK_TIME:
> dterr = DecodeTime(field[i], fmask, &tmask, tm, fsec);
> if (dterr)
> return dterr;
> type = DTK_DAY;
> @@ -2983,20 +3242,21 @@
> }
> /* DROP THROUGH */
>
> case DTK_DATE:
> case DTK_NUMBER:
> val = strtol(field[i], &cp, 10);
>
> if (type == IGNORE_DTF)
> type = DTK_SECOND;
>
> + /* should this allow ',' for locales that use it ? */
> if (*cp == '.')
> {
> fval = strtod(cp, &cp);
> if (*cp != '\0')
> return DTERR_BAD_FORMAT;
>
> if (val < 0)
> fval = -(fval);
> }
> else if (*cp == '\0')
>
> ===================================================================
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

In response to

Browse pgsql-advocacy by date

  From Date Subject
Next Message Ron Mayer 2003-09-08 20:42:11 Re: ISO 8601 "Time Intervals" of the "format with time-unit deignators"
Previous Message Bruce Momjian 2003-09-08 20:27:16 Re: ISO 8601 "Time Intervals" of the "format with time-unit

Browse pgsql-general by date

  From Date Subject
Next Message Ron Mayer 2003-09-08 20:42:11 Re: ISO 8601 "Time Intervals" of the "format with time-unit deignators"
Previous Message Bruce Momjian 2003-09-08 20:27:16 Re: ISO 8601 "Time Intervals" of the "format with time-unit

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2003-09-08 20:32:36 Re: constraint modification on todo list
Previous Message Tom Lane 2003-09-08 20:28:38 Re: plpgsql doesn't coerce boolean expressions to boolean

Browse pgsql-patches by date

  From Date Subject
Next Message Ron Mayer 2003-09-08 20:42:11 Re: ISO 8601 "Time Intervals" of the "format with time-unit deignators"
Previous Message Bruce Momjian 2003-09-08 20:27:16 Re: ISO 8601 "Time Intervals" of the "format with time-unit