Re: truncating timestamps on arbitrary intervals

From: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>
To: Artur Zakirov <zaartur(at)gmail(dot)com>
Cc: Isaac Morland <isaac(dot)morland(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Fetter <david(at)fetter(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: truncating timestamps on arbitrary intervals
Date: 2020-04-02 09:22:31
Message-ID: CACPNZCvMcaw3zUnRf6VN2B7NRGh4YWp6bwj6Eb+7Gs=7yHBb9w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 31, 2020 at 4:34 PM Artur Zakirov <zaartur(at)gmail(dot)com> wrote:
> Thank you for new version of the patch.

Thanks for taking a look! Attached is v8, which addresses your points,
adds tests and fixes some bugs. There are still some WIP detritus in
the timezone code, so I'm not claiming it's ready, but it's much
closer. I'm fairly confident in the implementation of timestamp
without time zone, however.

> I'm not sure that I fully understand the 'origin' parameter. Is it valid
> to have a value of 'origin' which is greater than a value of 'timestamp'
> parameter?

That is the intention. The returned values should be

origin +/- (n * interval)

where n is an integer.

> I get some different results in such case:
>
> =# select date_trunc_interval('2 year', timestamp '2020-01-16 20:38:40',
> timestamp '2022-01-17 00:00:00');
> date_trunc_interval
> ---------------------
> 2020-01-01 00:00:00

This was correct per how I coded it, but I have rethought where to
draw the bins for user-specified origins. I have decided that the
above is inconsistent with units smaller than a month. We shouldn't
"cross" the bin until the input has reached Jan. 17, in this case. In
v8, the answer to the above is

date_trunc_interval
---------------------
2018-01-17 00:00:00
(1 row)

(This could probably be better documented)

> =# select date_trunc_interval('3 year', timestamp '2020-01-16 20:38:40',
timestamp '2022-01-17 00:00:00');
> date_trunc_interval
> ---------------------
> 2022-01-01 00:00:00
>
> So here I'm not sure which result is correct.

This one is just plain broken. The result should always be equal or
earlier than the input. In v8 the result is now:

date_trunc_interval
---------------------
2019-01-17 00:00:00
(1 row)

> It seems that the patch is still in progress, but I have some nitpicking.
>
> > + <entry><literal><function>date_trunc_interval(<type>interval</type>, <type>timestamptz</type>, <type>text</type>)</function></literal></entry>
> > + <entry><type>timestamptz </type></entry>
>
> It seems that 'timestamptz' in both argument and result descriptions
> should be replaced by 'timestamp with time zone' (see other functions
> descriptions). Though it is okay to use 'timestamptz' in SQL examples.

Any and all nitpicks welcome! I have made these match the existing
date_trunc documentation more closely.

> timestamp_trunc_interval_internal() and
> timestamptz_trunc_interval_internal() have similar code. I think they
> can be rewritten to avoid code duplication.

I thought so too (and noticed the same about the existing date_trunc),
but it's more difficult than it looks.

Note: I copied some tests from timestamp to timestamptz with a few
tweaks. A few tz tests still don't pass. I'm not yet sure if the
problem is in the test, or my code. Some detailed review of the tests
and their results would be helpful.

--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
v8-datetrunc_interval.patch application/octet-stream 38.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Verite 2020-04-02 09:23:49 Re: proposal \gcsv
Previous Message Amit Kapila 2020-04-02 09:02:07 Re: WAL usage calculation patch