Re: storing TZ along timestamps

From: Jim Nasby <jim(at)nasby(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stuart Bishop <stuart(at)stuartbishop(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Herrera Alvaro <alvherre(at)commandprompt(dot)com>
Subject: Re: storing TZ along timestamps
Date: 2011-07-18 22:29:51
Message-ID: AD161118-B0D6-4AEA-8703-0D55BF2E8611@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Jul 18, 2011, at 12:29 AM, Robert Haas wrote:
> On Fri, Jul 8, 2011 at 11:13 AM, Stuart Bishop <stuart(at)stuartbishop(dot)net> wrote:
>> On Mon, Jun 6, 2011 at 7:50 AM, Jim Nasby <jim(at)nasby(dot)net> wrote:
>>> On Jun 4, 2011, at 3:56 AM, Greg Stark wrote:
>>>> On Thu, Jun 2, 2011 at 8:58 PM, Jim Nasby <jim(at)nasby(dot)net> wrote:
>>>>>
>>>>> I'm torn between whether the type should store the original time or the original time converted to GMT.
>>>>
>>>> This is the wrong way to think about it. We *never* store time
>>>> "converted to GMT". When we want to represent a point in time we
>>>> represent it as seconds since the epoch.
>>> Right. Sorry, my bad.
>>>
>>>> The question here is how to represent more complex concepts than
>>>> simply points in time. I think the two concepts under discussion are
>>>> a) a composite type representing a point in time and a timezone it
>>>> should be interpreted in for operations and display and b) the
>>>> original input provided which is a text string with the constraint
>>>> that it's a valid input which can be interpreted as a point in time.
>>>
>>> My fear with A is that something could change that would make it impossible to actually get back to the time that was originally entered. For example, a new version of the timezone database could change something. Though, that problem also exists for timestamptz today, so presumably if it was much of an issue we'd have gotten complaints by now.
>>
>> The common problem is daylight savings time being declared or
>> cancelled. This happens numerous times throughout the year, often with
>> short notice.
>>
>> If you want to store '6pm July 3rd 2014 Pacific/Fiji', and want that
>> to keep meaning 6pm Fiji time no matter what decisions the Fijian
>> government makes over the next two years, you need to store the
>> wallclock (local) time and the timezone. The wallclock time remains
>> fixed, but the conversion to UTC may float.
>>
>> If you are storing an point in time that remains stable no matter
>> future political decisions, you store UTC time and an offset. The
>> conversion to wallclock time may float, and your 6pm Fiji time meeting
>> might change to 5pm or 7pm depending on the policical edicts.
>
> This is a pretty good point. You would want the first of these
> (probably) for the time a meeting is scheduled to start, and the
> second for the time of a meteor shower (or some other natural event
> that doesn't care what the politicians decide).
>
> I feel like the second of these is pretty well handled by our existing
> timestamptz data type. Granted, you can't store the intended display
> time zone, but putting it in a separate column is not really a
> problem: at least, it has the right semantics. So maybe the first is
> the one we should be aiming at. If so, storing a counter and a time
> zone is the wrong approach: you need to record something like <year,
> month, day, hour, minute, second, tzname>.

Right; you need a timestamp and you need to know what timezone that timestamp was entered in. That means you can always convert that time to whatever timezone you'd like (like timestamptz), but you also know what time was originally entered, and what timezone it was entered in. Technically you can do that with a separate field, but that seems really ugly to me.

So what we're proposing is a new data type that stores a timestamp as well as the timezone that that time was originally entered in. We can't just store a 3 letter timezone abbreviation, because the mapping from 3 letter TZs to actual TZs is not fixed (see the timezone_abbreviations GUC). I believe the best way to handle this is a system table that stores the name of every timezone that the database has ever loaded from the timezone data files, along with an OID. That means that the storage for this is just a timestamp and an OID.
--
Jim C. Nasby, Database Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Josh Berkus 2011-07-18 22:37:15 Re: storing TZ along timestamps
Previous Message Mark Kirkwood 2011-07-18 22:29:45 Re: Re: [COMMITTERS] pgsql: Add temp_file_limit GUC parameter to constrain temporary file sp