Re: Rectifying wrong Date outputs

From: Piyush Newe <piyush(dot)newe(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Rectifying wrong Date outputs
Date: 2011-03-22 09:07:09
Message-ID: AANLkTin1R_hwGpyq788_JrsUDPJ2sR0SEbpcHLh=rx1z@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks Heikki, Tom & Robert for your valuable inputs.

According to the code, PG is behaving what is mentioned below.

1. Format = Y
0 ... 9 = 2000 ... 2009 (we are always adding 2000 to the year)

2. Format = YY
00 ... 69 = 2000 ... 2069 (we are adding 2000 to the year)
70 ... 99 = 1970 ... 1999 (we are adding 1900 to the year)

3. Format = YYY
100 ... 999 = 1100 ... 1999 (we are adding 1000 to the year)
000 ... 099 = 2000 ... 2099 (we are adding 2000 to the year)

4. Format = YYYY
Any number -4712 to 9999

If we closely look at the code, we can say that, if the given INPUT value is
not in the range of the DESIRED format (i.e. Y/YY/YYY/YYYY), then it results
some weired YEAR.

e.g.
1. TO_DATE('01-jan-2010', 'DD-MON-Y')
Here it falls in the 1st format case i.e. "Y". As per the code, we are
blindly adding 2000 in the year value, hence the result is returned as
4010.

2. TO_DATE('01-jan-2010', 'DD-MON-YYY')
Here it falls in the 3rd case i.e. "YYY". As per the code, without checking
the input we are adding 1000 to the value & hence it results 3010.

IMHO, before deciding the century we should check what is the INPUT. This
check is missing in the code. As Heikki said, we really don't have such
document anywhere. We need to atleast document what we are doing. However,
if we are doing something vague then we need to decide what we are really
going to follow. We need to follow one standard, then it might be either
Oracle or something else.

It might happened that the given input contains the Century. In this case,
we should intelligent enough and decide not to add anything to the year. To
avoid this situation, I still feel we have to follow the format given by the
user. i.e. "if the number of digits specified exceeds the number of Y",
lets throw an error. This will make our life easier & relatively easier to
fix.

Robert,
If we follow the standard what Oracle is using, we will not break any case.

One more observation in Oracle

SQL> SELECT to_char(TO_DATE('01-jan-0001', 'DD-MON-YY'), 'DD-MON-YYYY')
from dual;

TO_CHAR(TO_DATE('01-
--------------------
01-JAN-0001

SQL> SELECT to_char(TO_DATE('01-jan-1', 'DD-MON-YY'), 'DD-MON-YYYY') from
dual;

TO_CHAR(TO_DATE('01-
--------------------
01-JAN-2001

If we observe this closely, in the later case, century is not given in the
input, hence Oracle is taking the current century. But the first case is bit
puzzling. Because if we convert the '0001' to number it is still interprets
as 1, then why it is not adding current century their ? I think it is
checking the INPUT & accordingly taking the decision.

Anyways, will it work if we check the input before deciding the century ?

On Mon, Mar 21, 2011 at 10:35 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Mon, Mar 21, 2011 at 12:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> >> On Mon, Mar 21, 2011 at 10:18 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >>> As far as I can see, that would completely destroy the use-case of
> >>> trying to parse a string where there's not non-digit delimiters and
> >>> so you have to take exactly the specified number of digits, not more.
> >
> >> Yeah, I thought about that, but it seems that use case is already
> >> hopelessly broken in both PostgreSQL and Oracle, so I'm disinclined to
> >> worry about it.
> >
> > How so?
> >
> > regression=# select to_date('20110321', 'YYYYMMDD');
> > to_date
> > ------------
> > 2011-03-21
> > (1 row)
> >
> > regression=# select to_date('110321', 'YYMMDD');
> > to_date
> > ------------
> > 2011-03-21
> > (1 row)
> >
> > If you break the latter case, I am sure the villagers will be on your
> > doorstep shortly.
>
> Oh, dear. No wonder this code is so hard to get right.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

--
--
Piyush S Newe
Principal Engineer
EnterpriseDB
office: +91 20 3058 9500
www.enterprisedb.com

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of the
individual or entity to whom it is addressed. This message contains
information from EnterpriseDB Corporation that may be privileged,
confidential, or exempt from disclosure under applicable law. If you are not
the intended recipient or authorized to receive this for the intended
recipient, any use, dissemination, distribution, retention, archiving, or
copying of this communication is strictly prohibited. If you have received
this e-mail in error, please notify the sender immediately by reply e-mail
and delete this message.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message KONDO Mitsumasa 2011-03-22 10:11:44 Re: 2nd Level Buffer Cache
Previous Message Peter Eisentraut 2011-03-22 08:39:01 Re: Collations versus record-returning functions