temporal support patch

From: Miroslav Šimulčík <simulcik(dot)miro(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: temporal support patch
Date: 2012-05-16 21:14:53
Message-ID: CAHRNM69mdOtkPtHAqp=J=rn-LAFLpxMeCnPFLo4FFsb2BBfRVQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi all,

as a part of my master's thesis I have created temporal support patch for
PostgreSQL. It enables the creation of special temporal tables with entries
versioning. Modifying operations (UPDATE, DELETE, TRUNCATE) on these tables
don't cause permanent changes to entries, but create new versions of them.
Thus user can easily get to the past states of the table.

Basic information on temporal databases can be found on
http://en.wikipedia.org/wiki/Temporal_database

In field of temporal databases, there are only proprietary solution
available. During the analysis I found these:
- IBM DB2 10 for z/OS
- Oracle 11g Workspace Manager
- Teradata Database 13.10

Primary goal of my work was the creation of opensource solution, that is
easy to use and is backward compatible with existing applications, so that
the change of the original tables to temporal ones, does not require
changes to applications that work with them. This patch is built on
standard SQL/Temporal with some minor modifications inspired by commercial
temporal database systems. Currently it only deals with transaction time
support.

Here is simple description on how it works:

1. user can create transaction time table using modified CREATE TABLE
command:

CREATE TABLE person(name varchar(50)) AS TRANSACTIONTIME;

This command automatically creates all objects required for transaction
time support:

List of relations
Schema | Name | Type | Owner
--------+----------------------+----------+----------
public | person | table | tester
public | person__entry_id_seq | sequence | tester
public | person_hist | table | postgres

Table "public.person"
Column | Type |
Modifiers

------------+-----------------------------+------------------------------------------------------------------------------
name | character varying(50) |
_entry_id | bigint | not null default
nextval('person__entry_id_seq'::regclass)
_sys_start | timestamp without time zone | not null default
clock_timestamp()
_sys_end | timestamp without time zone | not null default
'294276-12-31 23:59:59.999999'::timestamp without time zone
Indexes:
"person__entry_id_idx" btree (_entry_id)
"person__sys_start__sys_end_idx" btree (_sys_start, _sys_end)

Table "public.person_hist"
Column | Type | Modifiers
------------+-----------------------------+-----------
name | character varying(50) |
_entry_id | bigint | not null
_sys_start | timestamp without time zone | not null
_sys_end | timestamp without time zone | not null
Indexes:
"person_hist__entry_id_idx" btree (_entry_id)
"person_hist__sys_start__sys_end_idx" btree (_sys_start, _sys_end)

Table person stores current versions of entries. 3 additional columns
are added:
_entry_id - id of entry. It groups together different versions of
entry.
_sys_start - beginning of the version validity period (version
creation timestamp).
_sys_end - end of the version validity period.

Table person_hist stores historical versions of entries. It has the
same structure and indexes as the person table, but without any constraints
and default values.

2. another way of creating transaction time table is adding transaction
time support to existing standard table using ALTER command.

CREATE TABLE person(name varchar(50));
ALTER TABLE person ADD TRANSACTIONTIME;

3. INSERT entry

INSERT INTO person VALUES('Jack');

SELECT *, _entry_id, _sys_start, _sys_end FROM person;

name | _entry_id | _sys_start | _sys_end

------+-----------+----------------------------+------------------------------
Jack | 1 | 2012-05-16 22:11:39.856916 | 294276-12-31
23:59:59.999999

4. UPDATE entry

UPDATE person SET name = 'Tom';

SELECT *, _entry_id, _sys_start, _sys_end FROM person;

name | _entry_id | _sys_start | _sys_end

------+-----------+----------------------------+------------------------------
Tom | 1 | 2012-05-16 22:11:44.736195 | 294276-12-31
23:59:59.999999

SELECT * FROM person_hist;

name | _entry_id | _sys_start | _sys_end

------+-----------+----------------------------+----------------------------
Jack | 1 | 2012-05-16 22:11:39.856916 | 2012-05-16
22:11:44.736194

5. DELETE entry

DELETE FROM person;

SELECT *, _entry_id, _sys_start, _sys_end FROM person;

name | _entry_id | _sys_start | _sys_end
------+-----------+------------+----------

SELECT * FROM person_hist;

name | _entry_id | _sys_start | _sys_end

------+-----------+----------------------------+----------------------------
Jack | 1 | 2012-05-16 22:11:39.856916 | 2012-05-16
22:11:44.736194
Tom | 1 | 2012-05-16 22:11:44.736195 | 2012-05-16
22:14:33.875869

6. selecting entries

INSERT INTO person VALUES('Mike');
INSERT INTO person VALUES('Mike');

--standard SELECT - operates only with current versions of entries
SELECT * FROM person;

name
------
Mike
Mike

--special temporal SELECT which operates with all versions
NONSEQUENCED TRANSACTIONTIME SELECT *, _entry_id, _sys_start, _sys_end
FROM person;

name | _entry_id | _sys_start | _sys_end

------+-----------+----------------------------+------------------------------
Mike | 3 | 2012-05-16 22:20:55.055671 | 294276-12-31
23:59:59.999999
Mike | 2 | 2012-05-16 22:20:51.619475 | 294276-12-31
23:59:59.999999
Jack | 1 | 2012-05-16 22:11:39.856916 | 2012-05-16
22:11:44.736194
Tom | 1 | 2012-05-16 22:11:44.736195 | 2012-05-16
22:14:33.875869

--special temporal SELECT which operates with versions valid in
specified time
TRANSACTIONTIME AS OF '2012-05-16 22:11:39.856916' SELECT *, _entry_id,
_sys_start, _sys_end FROM person;

name | _entry_id | _sys_start | _sys_end

------+-----------+----------------------------+----------------------------
Jack | 1 | 2012-05-16 22:11:39.856916 | 2012-05-16
22:11:44.736194

--it is also possible to set timestamp globally for session. All
subsequent SELECTs without any temporal modifier will operate with versions
valid in this time,
SET history_timestamp TO '2012-05-16 22:11:39.856916';

SELECT * FROM person;

name
------
Jack

--to select only current versions when history_tiumestamp is set,
CURRENT TRANSACTIONTIME have to be used with SELECT
CURRENT TRANSACTIONTIME SELECT * FROM person;

name
------
Mike
Mike

This is only a illustration of main functionality. Later I can create a
document about the design and implementation details, but first I need to
know if such temporal features as described here, could be added to future
versions of PostgreSQL, after meeting all the requirements of a new patch.

Regards

Miroslav Simulcik

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2012-05-16 21:30:27 Re: Draft release notes complete
Previous Message Stephen Frost 2012-05-16 20:53:51 Re: Pre-alloc ListCell's optimization