Re: Created feature for to_date() conversion using patterns 'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'

From: Mark Lorenz <postgres(at)four-two(dot)de>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Created feature for to_date() conversion using patterns 'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'
Date: 2019-12-20 08:38:16
Message-ID: ac2819032a84c356fe01e04113cb87dd@four-two.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

while preparing the patch for the Commitfest, I found a bug in the
to_char() function that is quite correlated with this issue:

SELECT to_char('1997-02-01'::date, 'YYYY-WW-D')

returns: 1997-05-7 -> which is ok, I believe. Feb, 1st was on Saturday,
so counting from Sundays, it was day 7 of week 5.

SELECT to_char('1997-02-03'::date, 'YYYY-WW-D')

returns: 1997-05-2 -> This cannot be. The input date is two days laters,
but the result is 5 days earlier. I'd expect 1997-06-2 as result, but
this occurs another week later:

SELECT to_char('1997-02-10'::date, 'YYYY-WW-D')

This is wrong, because this should be week 7 instead. On the other hand,
the ISO week formats work very well.

I'll have a look at the code and try to fix it in the patch as well.

Kind regards,
Mark

Am 2019-10-08 17:49, schrieb Mark Lorenz:
> Hi,
>
> I apologize for the mistake.
>
> For the mailing list correspondence I created this mail account. But I
> forgot to change the sender name. So, the "postgres" name appeared as
> sender name in the mailing list. I changed it.
>
> Kind regards,
> Mark/S-Man42
>
>> Hi,
>>
>> some days ago I ran into a problem with the to_date() function. I
>> originally described it on StackExchange:
>> https://dba.stackexchange.com/questions/250111/unexpected-behaviour-for-to-date-with-week-number-and-week-day
>>
>> The problem:
>>
>> If you want to parse a date string with year, week and day of week,
>> you can do this using the ISO week pattern: 'IYYY-IW-ID'. This works
>> as expected:
>>
>> date string | to_date()
>> ------------+------------
>> '2019-1-1' | 2018-12-31 -> Monday of the first week of the year
>> (defined as the week that includes the 4th of January)
>> '2019-1-2' | 2019-01-01
>> '2019-1-3' | 2019-01-02
>> '2019-1-4' | 2019-01-03
>> '2019-1-5' | 2019-01-04
>> '2019-1-6' | 2019-01-05
>> '2019-1-7' | 2019-01-06
>>
>> '2019-2-1' | 2019-01-07
>> '2019-2-2' | 2019-01-08
>>
>> But if you are trying this with the non-ISO pattern 'YYYY-WW-D', the
>> result was not expected:
>>
>> date string | to_date()
>> -------------------------
>> '2019-1-1' | 2019-01-01
>> '2019-1-2' | 2019-01-01
>> '2019-1-3' | 2019-01-01
>> '2019-1-4' | 2019-01-01
>> '2019-1-5' | 2019-01-01
>> '2019-1-6' | 2019-01-01
>> '2019-1-7' | 2019-01-01
>>
>> '2019-2-1' | 2019-01-08
>> '2019-2-2' | 2019-01-08
>>
>> As you can see, the 'D' part of the pattern doesn't influence the
>> resulting date.
>>
>> The answer of Laurenz Albe pointed to a part of the documentation, I
>> missed so far:
>>
>> "In to_timestamp and to_date, weekday names or numbers (DAY, D, and
>> related field types) are accepted but are ignored for purposes of
>> computing the result. The same is true for quarter (Q) fields."
>> (https://www.postgresql.org/docs/12/functions-formatting.html)
>>
>> So, I had a look at the relevant code part. I decided to try a patch
>> by myself. Now it works as I would expect it:
>>
>> date string | to_date()
>> -------------------------
>> '2019-1-1' | 2018-12-30 -> Sunday (!) of the first week of the year
>> (the first week is at the first day of year)
>> '2019-1-2' | 2018-12-31
>> '2019-1-3' | 2019-01-01
>> '2019-1-4' | 2019-01-02
>> '2019-1-5' | 2019-01-03
>> '2019-1-6' | 2019-01-04
>> '2019-1-7' | 2019-01-05
>>
>> '2019-2-1' | 2019-01-06
>> '2019-2-2' | 2019-01-07
>>
>> Furthermore, if you left the 'D' part, the date would be always set to
>> the first day of the corresponding week (in that case it is Sunday, in
>> contrast to the ISO week, which starts mondays).
>>
>> To be consistent, I added similar code for the week of month pattern
>> ('W'). So, using the pattern 'YYYY-MM-W-D' yields in:
>>
>> date string | to_date()
>> ---------------------------
>> '2018-12-5-1' | 2018-12-23
>> '2018-12-6-1' | 2018-12-30
>> '2019-1-1-1' | 2018-12-30 -> First day (Su) of the first week of the
>> first month of the year
>> '2019-2-2-1' | 2019-02-03 -> First day (Su) of the second week of
>> February
>> '2019-10-3-5' | 2019-10-17 -> Fifth day (Th) of the third week of
>> October
>>
>> If you left the 'D', it would be set to 1 as well.
>>
>> The code can be seen here:
>> https://github.com/S-Man42/postgres/commit/534e6bd70e23864f385d60ecf187496c7f4387c9
>>
>> I hope, keeping the code style of the surrounding code (especially the
>> ISO code) is ok for you.
>>
>> Now the questions:
>> 1. Although the ignorance of the 'D' pattern is well documented, does
>> the new behaviour might be interesting for you?
>> 2. Does it work as you'd expect it?
>> 3. Because this could be my very first contribution to the PostgreSQL
>> code base, I really want you to be as critical as possible. I am not
>> quite sure if I didn't miss something important.
>> 4. Currently something like '2019-1-8' does not throw an exception but
>> results in the same as '2019-2-1' (8th is the same as the 1st of the
>> next week). On the other hand, currently, the ISO week conversion
>> gives out the result of '2019-1-7' for every 'D' >= 7. I am not sure
>> if this is better. I think a consistent exception handling should be
>> discussed separately (date roll over vs. out of range exception vs.
>> ISO week behaviour)
>>
>> So far, I am very curious about your opinions!
>>
>> Kind regards,
>> Mark/S-Man42

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Adam Lee 2019-12-20 09:16:26 Re: Memory-Bounded Hash Aggregation
Previous Message Kyotaro Horiguchi 2019-12-20 08:29:21 Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions