Re: Re: Add generate_series(date,date) and generate_series(date,date,integer)

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: David Steele <david(at)pgmasters(dot)net>
Cc: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: Add generate_series(date,date) and generate_series(date,date,integer)
Date: 2016-03-17 15:55:16
Message-ID: CAKFQuwZgVZ5n4Ba1aYS_ek-hoA1iEyP+z33RERRRkVx7tigqXQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 17, 2016 at 8:41 AM, David Steele <david(at)pgmasters(dot)net> wrote:

> On 3/17/16 11:30 AM, David G. Johnston wrote:
> > On Thu, Mar 17, 2016 at 7:57 AM, Corey Huinker <corey(dot)huinker(at)gmail(dot)com
> > <mailto:corey(dot)huinker(at)gmail(dot)com>>wrote:
> >
> > On Thu, Mar 17, 2016 at 10:00 AM, David Steele <david(at)pgmasters(dot)net
> > <mailto:david(at)pgmasters(dot)net>> wrote:
> >
> > On 3/17/16 4:49 AM, Dean Rasheed wrote:
> >
> > > On 16 March 2016 at 23:32, David Steele <david(at)pgmasters(dot)net
> <mailto:david(at)pgmasters(dot)net>> wrote:
> > >
> > >>
> > >> I think in this case it comes down to a committer's judgement
> so I have
> > >> marked this "ready for committer" and passed the buck on to
> Álvaro.
> > >
> > > So I was pretty much "meh" on this patch too, because I'm not
> > > convinced it actually saves much typing, if any.
> > >
> > > However, I now realise that it introduces a
> backwards-compatibility
> > > breakage. Today it is possible to type
> > >
> > > SELECT * FROM generate_series('01-01-2000'::date,
> '01-04-2000', '7 days');
> >
> > It can also be broken as below and this is even scarier to me:
> >
> >
> > Above and below are the same query​...
>
> Not sure I agree. My point was that even if developers were pretty
> careful with their casting (or are using two actual dates) then there's
> still possibility for breakage.
>

With the first argument casted to date it doesn't matter whether you cast
the second argument as the pseudo-type anyelement will take its value from
the first argument and force the second to date.

The main problem is that the system tries to cast unknown to integer based
upon finding gs(date, date, integer) and fails without ever considering
that gs(ts, ts, interval) would succeed.

> > postgres=# SELECT * FROM generate_series('01-01-2000'::date,
> > '01-04-2000'::date, '7 days');
> > ERROR: invalid input syntax for integer: "7 days"
> > LINE 1: ...te_series('01-01-2000'::date, '01-04-2000'::date, '7
> > days');
> > <...>
> >
> > I see two ways around this:
> >
> > 1. Drop the step parameter entirely. My own use cases only ever
> > require the step values 1 and -1, and which one the user wants can
> > be inferred from (start < end). This would still change the output
> > type where a person wanted timestamps, but instead input two dates.
> >
> > ​Undesirable.​
>
> Very undesirable. Week intervals are a very valid use case and I don't
> like the automagic interval idea.
>
> >
> > 2. Rename the function date_series() or generate_series_date()
> >
> > I still think this is an important function because at the last
> > several places I've worked, I've found myself manufacturing a query
> > where some event data is likely to have date gaps, but we want to
> > see the date gaps or at least have the 0 values contribute to a
> > later average aggregate.
> >
> >
> > ​I'd call it "generate_dates(...)" and be done with it.
> >
> > We would then have:
> >
> > generate_series()
> > generate_subscripts()
> > generate_dates()
>
> To me this completely negates the idea of this "just working" which is
> why it got a +1 from me in the first place. If I have to remember to
> use a different function name then I'd prefer to just cast on the
> timestamp version of generate_series().
>
>
​So let the user decide whether to trade-off learning a new function name
but getting dates instead of timestamps or going through the hassle of
casting.​

For me, it would have been a nice bonus if the generate_series() could be
used directly but I feel that the desired functionality is desirable and
the name itself is of only minor consideration.

I can see that newbies might ponder why two functions exist but they should
understand "backward compatibility".

I suspect that most people will simply think: "use generate_series for
numbers and use generate_dates for, well, dates". The ones left out in the
cold are those wondering why they use "generate_series" for timestamp
series with a finer precision than date. I'd be willing to offer a
"generate_timestamps" to placate them. And might as well toss in
"generate_numbers" for completeness - though now I likely doom the
proposal...so I'll stick with just add generate_dates so the behavior is
possible without superfluous casting or the need to write a custom
function. Dates are too common a unit of measure to leave working with
them this cumbersome.

David J.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Steele 2016-03-17 16:03:49 Re: Re: Add generate_series(date,date) and generate_series(date,date,integer)
Previous Message David Steele 2016-03-17 15:41:18 Re: Re: Add generate_series(date,date) and generate_series(date,date,integer)