Re: Re: to_date problems (Re: Favor for Postgres User at WSI)

From: Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu>
To: Karel Zak <zakkr(at)zf(dot)jcu(dot)cz>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kate Collins <klcollins(at)wsicorp(dot)com>, pgsql-bugs(at)postgresql(dot)org, John Rickman <john(dot)rickman(at)greatbridge(dot)com>
Subject: Re: Re: to_date problems (Re: Favor for Postgres User at WSI)
Date: 2000-11-11 17:50:13
Message-ID: 3A0D86D5.765EC7D9@alumni.caltech.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

(Sorry for diving in late; I was out of town the last few days)

> > This case I *would* have expected to produce 1 BC, but nope...
> Where is *guarantee* that the year is 4-digits?!

There is no guarantee of only four digits, but there is a convention
that two digit years refer to the current/previous/next century. I've
worked through the same issues with the other date/time types, and have
evolved the code through exactly the same path you are taking. And been
annoyed by all of the arcane details in doing it ;)

> test=# select '123456-11-12'::date;
> ?column?
> --------------
> 123456-11-12
> (1 row)
> The to_char/timestamp not try expect that YYYY = 4-digits (see docs)

No problem there. But it will be good to conform to the 4 digit/2 digit
conventions when 4 or 2 digits appear in the year field.

> And 'YY' - it's hell, what is '00'? ... 1900 or 2000 or 20000?

Depends on what year is specified. Our Y2K statement (may it rest in
peace; seems pretty silly now, eh?) specifies the expected behavior. We
currently use 1970 as the transition for centuries with two digit years
(I did this as a nod to Unix) but I believe it is more common to use
1950 as the transition year. I don't feel a need to move to this more
common convention, but would be willing to do so if others see that as
helpful or important.

> > That seems broken in current sources, too:
> > regression=# select to_date( '20001112', 'YYYYMMDD');
> > ERROR: Unable to convert timestamp to date
> Yes, because to_date() expect that year is '20001112' and internal
> PG's date/time routines disable convert it.

The other date/time routines have heuristics when parsing long integer
strings.

2 chars is a yy, mm, or dd
3 chars is a day number
4 chars is a yyyy
5 chars is a yyddd
6 chars is a yymmdd

These are documented in the appendix on parsing date/times.

> test=# select to_date( '2000-1112', 'YYYY-MMDD');
> to_date
> ------------
> 2000-11-12
> > Looks like you've rooted out a number of problems in to_date (which
> > in fairness is new-in-7.0 code). I've cc'd this to to_date's author,
> > whom I hope will find a fix for 7.1.
> How? Create limit for YYYY to 4-digits?

Solved with heuristics.

> > BTW, direct conversion to timestamp does something sensible in all
> > these cases:
> > regression=# select '001112'::timestamp;
> > ?column?
> > ------------------------
> > 2000-11-12 00:00:00-05
> > (1 row)
> Why not 1900?

Because of the common and documented cutoff date (1970 currently, 1950
in some other apps) used to solve this problem.

> test=# select to_char('1900-11-12'::date, 'YYDDMM')::timestamp;
> ?column?
> ------------------------
> 2000-12-11 00:00:00+01
> (1 row)
> What is right here? Really '00' = 2000? .... but input is 1900

That is operator error; converting a year outside the cutoff range to a
string and then converting it back to a date/time type gets you what you
deserve for using two-digit years. (I know you did this for
illustration, but two digit years can be dangerous, as you are pointing
out).

> test=# select '200001112'::timestamp;
> ERROR: Bad timestamp external representation '200001112'
> ???

I was going to claim that the "long year" is covered in the existing
heuristics, but I'll now claim that rejecting arbitrarily long,
undelimited ISO dates is preferred ;)

> Well, possible solution: to_timestamp/date see if in the input is
> some separator (non-digit char) between YYYY and next template (like DD),
> if separator not here to_date() will expect 4-digits year.
> '20001112' not separator --> 4-digits for year = 2000
> '20000-11-12' separator '-' --> read it as 20000
> '200001112' not separator --> 4-digits for year = 2000
> --> 2-digits for month = 01
> --> 2-digits for day = 11
> --> last '2' is ignored
> With 'YY' *hell* I not sure... add current year IMHO not must be
> correct. I mean that correct solution is:
> test=# select to_date('00-12-11', 'YY-DD-MM');
> ERROR: Can't convert 'YY' to a real year.
> But if users want for their Oracle->PG port use 'YY' as last two digits
> in the current year, not problem make it....

Karel, I can help polish the heuristics with you. That 1970/1950
convention is something you can rely on.

- Thomas

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Karel Zak 2000-11-12 11:18:14 Re: Re: to_date problems (Re: Favor for Postgres User at WSI)
Previous Message Frank Miles 2000-11-10 18:29:19 Re: index(fct(primary key)) kills INSERTs