Re: Meridiem markers (was: [BUGS] Incorrect "invalid AM/PM string" error from to_timestamp)

From: "Brendan Jurd" <direvus(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Alex Hunsaker" <badalex(at)gmail(dot)com>
Subject: Re: Meridiem markers (was: [BUGS] Incorrect "invalid AM/PM string" error from to_timestamp)
Date: 2009-01-18 11:24:15
Message-ID: 37ed240d0901180324k7c07445dlcffb41eb25f17cca@mail.gmail.com
Lists: pgsql-hackers

On Sat, Sep 27, 2008 at 4:25 AM, Brendan Jurd <direvus(at)gmail(dot)com> wrote:
> Currently, Postgres accepts four separate flavours for specifying
> meridiem markers, given by uppercase/lowercase and with/without
> periods:
>
> * am/pm
> * AM/PM
> * a.m./p.m.
> * A.M./P.M.
>
> I would go so far as to say that we should accept any of the 8 valid
> meridiem markers, regardless of which flavour is indicated by the
> formatting keyword.
>
> Day and month names already work this way. We don't throw an error if
> a user specifies a mixed-case month name like "Sep" but uses the
> uppercase formatting keyword "MON".

I've been thinking further about this lately, and whilst the month and
day name tokens aren't fussy about *case*, they do make a distinction
about *length*.

So, while MON will match "Sep", "SEP" and "sep" just fine, it will
have issues with "September" (it will match the first three characters
as "Sep" and then leave the remaining characters "tember" to bork up
the next token).

Likewise, MONTH will not match "Sep"; it needs the full month name.
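
To make the length point concrete, here is a rough Python sketch of the
matching behaviour described above. It is only an illustration, not the
actual Postgres code; the names MONTHS, match_mon and match_month are
made up for this example.

```python
# Illustrative sketch only: a toy model of how the MON and MONTH tokens
# consume input, as described in this message. All names are hypothetical.
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def match_mon(text, pos=0):
    """Match a three-letter month abbreviation at pos, ignoring case.

    Returns (month_number, new_pos), or None if nothing matches.
    Exactly three characters are consumed, never more.
    """
    chunk = text[pos:pos + 3].lower()
    for number, name in enumerate(MONTHS, start=1):
        if chunk == name[:3].lower():
            return number, pos + 3
    return None


def match_month(text, pos=0):
    """Match a full month name at pos, ignoring case.

    Returns (month_number, new_pos), or None if nothing matches.
    An abbreviation such as "Sep" is not accepted.
    """
    for number, name in enumerate(MONTHS, start=1):
        if text[pos:pos + len(name)].lower() == name.lower():
            return number, pos + len(name)
    return None
```

With this model, match_mon("September") succeeds but stops after three
characters, leaving "tember" for whatever token comes next, while
match_month("Sep") fails outright.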

I think, for to_timestamp(), it's important that the user have a solid
idea of how many characters each formatting token wants to consume.
With the am/pm and bc/ad markers, we've got two possibilities for
length: without periods (2 characters) and with periods (4
characters). Having the 2-character token match against a 4-character
string might cause more confusion than convenience.

It may make more sense to keep the different lengths separate, so that
a 2-character token will match any of "am", "pm", "AM", "PM", and a
4-character token will match any of "a.m.", "p.m.", "A.M.", "P.M.".
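
As a minimal sketch of that proposal (again in Python, and again not
the real implementation; match_meridiem and the two lookup tables are
hypothetical names), each token would have a fixed width and ignore
case within that width:

```python
# Illustrative sketch only: the proposed rule that a meridiem token
# matches case-insensitively but only at its own fixed length.
# Values are True for PM, False for AM. All names are hypothetical.
MERIDIEM_SHORT = {"am": False, "pm": True}
MERIDIEM_DOTTED = {"a.m.": False, "p.m.": True}


def match_meridiem(text, pos, dotted):
    """Match a meridiem marker at pos.

    dotted=False models a 2-character token (e.g. AM or pm);
    dotted=True models a 4-character token (e.g. A.M. or p.m.).
    Returns (is_pm, new_pos), or None if nothing matches.

    Assumption: case is folded with lower(), so mixed-case input such
    as "Pm" is accepted too, which goes slightly beyond the four
    spellings listed for each width.
    """
    table = MERIDIEM_DOTTED if dotted else MERIDIEM_SHORT
    width = 4 if dotted else 2
    chunk = text[pos:pos + width].lower()
    if chunk in table:
        return table[chunk], pos + width
    return None
```

Under this rule the user always knows how many characters the token
will consume: a 2-character token given "p.m." fails rather than
silently eating part of the input, and a 4-character token given "pm"
fails likewise.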

Comments?

Cheers,
BJ
