Re: Re: to_date problems (Re: Favor for Postgres User at WSI)

From: Karel Zak <zakkr(at)zf(dot)jcu(dot)cz>
To: Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kate Collins <klcollins(at)wsicorp(dot)com>, pgsql-bugs(at)postgresql(dot)org, John Rickman <john(dot)rickman(at)greatbridge(dot)com>
Subject: Re: Re: to_date problems (Re: Favor for Postgres User at WSI)
Date: 2000-11-12 11:18:14
Message-ID: Pine.LNX.3.96.1001112120045.30482A-100000@ara.zf.jcu.cz
Lists: pgsql-bugs


On Sat, 11 Nov 2000, Thomas Lockhart wrote:

> > > This case I *would* have expected to produce 1 BC, but nope...
> > Where is the *guarantee* that the year is 4 digits?!
>
> There is no guarantee of only four digits, but there is a convention
> that two digit years refer to the current/previous/next century. I've
> worked through the same issues with the other date/time types, and have
> evolved the code through exactly the same path you are taking. And been
> annoyed by all of the arcane details in doing it ;)
>
> > test=# select '123456-11-12'::date;
> > ?column?
> > --------------
> > 123456-11-12
> > (1 row)
> > The to_char/timestamp code does not try to assume that YYYY = 4 digits (see docs)
>
> No problem there. But it will be good to conform to the 4 digit/2 digit
> conventions when 4 or 2 digits appear in the year field.
>
> > And 'YY' - it's hell, what is '00'? ... 1900 or 2000 or 20000?
>
> Depends on what year is specified. Our Y2K statement (may it rest in
> peace; seems pretty silly now, eh?) specifies the expected behavior. We
> currently use 1970 as the transition for centuries with two digit years
> (I did this as a nod to Unix) but I believe it is more common to use
> 1950 as the transition year. I don't feel a need to move to this more
> common convention, but would be willing to do so if others see that as
> helpful or important.
>
> > > That seems broken in current sources, too:
> > > regression=# select to_date( '20001112', 'YYYYMMDD');
> > > ERROR: Unable to convert timestamp to date
> > Yes, because to_date() expects the year to be '20001112', and the internal
> > PG date/time routines refuse to convert it.
>
> The other date/time routines have heuristics when parsing long integer
> strings.
>
> 2 chars is a yy, mm, or dd
> 3 chars is a day number
> 4 chars is a yyyy
> 5 chars is a yyddd
> 6 chars is a yymmdd

Yes, but in to_char/to_date the parsing must be driven by the format
template, and the inputs can be very heterogeneous.
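
For illustration only, a rough C sketch of that length-based heuristic
(this is not the actual datetime.c code, and all names here are invented):

#include <stdio.h>
#include <string.h>

typedef enum
{
    GUESS_YY_MM_OR_DD,  /* 2 chars: yy, mm, or dd */
    GUESS_DOY,          /* 3 chars: a day-of-year number */
    GUESS_YYYY,         /* 4 chars: a four-digit year */
    GUESS_YYDDD,        /* 5 chars: yy plus day-of-year */
    GUESS_YYMMDD,       /* 6 chars: yy, mm, dd */
    GUESS_UNKNOWN       /* anything else is rejected */
} DateGuess;

static DateGuess
guess_numeric_field(const char *field)
{
    switch (strlen(field))
    {
        case 2: return GUESS_YY_MM_OR_DD;
        case 3: return GUESS_DOY;
        case 4: return GUESS_YYYY;
        case 5: return GUESS_YYDDD;
        case 6: return GUESS_YYMMDD;
        default: return GUESS_UNKNOWN;
    }
}

int
main(void)
{
    const char *samples[] = {"12", "317", "2000", "00317", "001112", "200001112"};
    int         i;

    for (i = 0; i < 6; i++)
        printf("%-9s -> guess %d\n", samples[i], (int) guess_numeric_field(samples[i]));
    return 0;
}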

> These are documented in the appendix on parsing date/times.

OK.

>
> > test=# select to_date( '2000-1112', 'YYYY-MMDD');
> > to_date
> > ------------
> > 2000-11-12
> > > Looks like you've rooted out a number of problems in to_date (which
> > > in fairness is new-in-7.0 code). I've cc'd this to to_date's author,
> > > whom I hope will find a fix for 7.1.
> > How? Create limit for YYYY to 4-digits?
>
> Solved with heuristics.
>
> > > BTW, direct conversion to timestamp does something sensible in all
> > > these cases:
> > > regression=# select '001112'::timestamp;
> > > ?column?
> > > ------------------------
> > > 2000-11-12 00:00:00-05
> > > (1 row)
> > Why not 1900?
>
> Because of the common and documented cutoff date (1970 currently, 1950
> in some other apps) used to solve this problem.
>
> > test=# select to_char('1900-11-12'::date, 'YYDDMM')::timestamp;
> > ?column?
> > ------------------------
> > 2000-12-11 00:00:00+01
> > (1 row)
> > What is right here? Really '00' = 2000? .... but input is 1900
>
> That is operator error; converting a year outside the cutoff range to a
> string and then converting it back to a date/time type gets you what you
> deserve for using two-digit years. (I know you did this for
> illustration, but two digit years can be dangerous, as you are pointing
> out).
>
> > test=# select '200001112'::timestamp;
> > ERROR: Bad timestamp external representation '200001112'
> > ???
>
> I was going to claim that the "long year" is covered in the existing
> heuristics, but I'll now claim that rejecting arbitrarily long,
> undelimited ISO dates is preferred ;)
>
> > Well, a possible solution: to_timestamp/to_date could check whether there is
> > some separator (non-digit char) in the input between YYYY and the next
> > template item (like DD); if no separator is there, to_date() will expect a
> > 4-digit year.
> >     '20001112'     no separator    --> 4 digits for year = 2000
> >     '20000-11-12'  separator '-'   --> read it as 20000
> >     '200001112'    no separator    --> 4 digits for year = 2000
> >                                    --> 2 digits for month = 01
> >                                    --> 2 digits for day = 11
> >                                    --> last '2' is ignored
> > About the 'YY' *hell* I am not sure... guessing it from the current year is
> > IMHO not necessarily correct. I mean that the correct solution is:
> > test=# select to_date('00-12-11', 'YY-DD-MM');
> > ERROR: Can't convert 'YY' to a real year.
> > But if users want, for their Oracle->PG port, to use 'YY' as the last two
> > digits of a year in the current century, it's no problem to make it so....
>
> Karel, I can help polish the heuristics with you. That 1970/1950
> convention is something you can rely on.

I try "steal" and study your code from date/time routines and try
implement correct (mean like others PG routines) YY/YYY conversion
with 1970/1950. For 4-digits or "long years" I use separator analyze.
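
Roughly something like this minimal C sketch; it is not formatting.c
itself, all names are invented, and the 70/99 split below is just my
reading of the 1970 convention mentioned above:

#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Expand a two-digit 'YY' year using a 1970-style cutoff:
 * 70..99 -> 19xx, 00..69 -> 20xx.
 */
static int
expand_yy(int yy)
{
    return (yy >= 70) ? 1900 + yy : 2000 + yy;
}

/*
 * Read the year for a 'YYYY' template node with separator analysis.
 * A non-digit terminating the digit run delimits the year, so the whole
 * run belongs to it ('20000-11-12' -> 20000).  An undelimited run
 * ('20001112' or '200001112') is assumed to start with a four-digit
 * year; the remaining digits are left for the following MM/DD nodes.
 * Advances *src past the consumed digits; returns -1 on bad input.
 */
static int
read_yyyy(const char **src)
{
    const char *s = *src;
    int         len = 0;
    char        buf[16];

    while (isdigit((unsigned char) s[len]))
        len++;

    if (s[len] == '\0' && len > 4)
        len = 4;            /* no separator: expect a 4-digit year */

    if (len == 0 || len >= (int) sizeof(buf))
        return -1;          /* caller would raise a conversion error */

    memcpy(buf, s, len);
    buf[len] = '\0';
    *src = s + len;
    return atoi(buf);
}

int
main(void)
{
    const char *in1 = "20001112";       /* YYYYMMDD, no separator */
    const char *in2 = "20000-11-12";    /* five-digit year, '-' separator */

    printf("%d %d %d\n", read_yyyy(&in1), read_yyyy(&in2), expand_yy(0));
    /* prints: 2000 20000 2000 */
    return 0;
}

With those two pieces, '20001112' under 'YYYYMMDD' comes out with year
2000 and '20000-11-12' under 'YYYY-MM-DD' with year 20000, which matches
the examples quoted above.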

I'm not worried; our to_char/to_date is already better than the original
Oracle to_char() :-)
