Re: journaling / time travel

From: phb07 <phb07(at)apra(dot)asso(dot)fr>
To: willybas(at)gmail(dot)com
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: journaling / time travel
Date: 2016-09-23 20:57:47
Message-ID: db3869ef-2696-fc79-cec4-85478dbc814e@apra.asso.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


> On 09/19/2016 08:48 AM, Willy-Bas Loos wrote:
> > Since records can
> > be changed afterwards, it has been argued that we should have
> > "journaling", meaning that every change to the data is saved in a
> > separate schema that holds a "journaling" copy of each table
>
> I don't think this is especially unusual. I've worked on similar
> projects that maintain history for regulatory reasons. I can't speak to
> your legal situation, but I don't think keeping history is a strange choice.
>
> Re the security argument: it seems to me it depends on whether you can
> restrict access to the journal while providing access to the normal
> tables. I guess journaling would help for some threats but not others,
> right? If regular users can't touch the journal, then I agree keeping
> one could make sense.
>
> Are you saying the journaling portion is 624GB on top of 1.1TB? Or that
> of the 1.1TB, 624GB of it is from journaling? Either way it doesn't seem
> like a massive cost to me.
>
> I don't think PITR is an appropriate solution to keeping a 10-year
> history of changes.
>
> It sounds like you're not looking for a green-field solution, but just
> trying to get perspective on what others are doing. Some resources for
> this that might help you:
>
> https://www.youtube.com/watch?v=TRgni5q0YM8
> https://github.com/arkhipov/temporal_tables (aka
> http://pgxn.org/dist/temporal_tables/)
> http://www.cs.arizona.edu/~rts/tdbbook.pdf
> <http://www.cs.arizona.edu/%7Erts/tdbbook.pdf> (also available in print)
> https://www.amazon.com/Bitemporal-Data-Practice-Tom-Johnston/dp/0124080677
>
> The two books contain multiple approaches to storing history, each with
> tradeoffs for ease-of-use, disk space, etc. Reading them might be
> overkill for you. If you want to choose one, I'd read Snodgrass. The
> first few chapters are tedious, but it gets better.
>
> Temporal databases are in a funny in-between zone where there is lots of
> research, but standard tools are fairly underdeveloped. Postgres
> recently added range types and exclusion constraints, which are
> important primitives for building a temporal system, but it is still a
> long way from SQL:2011, and SQL:2011 is itself a long way from
> everything you might want. One thing that seems lacking to me, even in
> the research, is how to handle DDL changes. You should be glad that you
> only care about audit history and not subject history too, because going
> bi-temporal is where you really cross over into lack of available tools,
> outside of a few commercial offerings. (Teradata has temporal support,
> using a Snodgrass-like approach that pre-dates the standard.)
>
> Paul
>
You may also have a look on the E-Maj extension. It is available for
instance here on pgxn <http://pgxn.org/dist/e-maj/>.
Its purpose is to record all updates on tables either to examine them or
to "rollback" them later. But it should be also usable for auditing.
The documentation in the package includes a detailed reference guide,
but also a presentation that allows to quickly have a view on the extention.
Even if you think E-Maj doesn't fit your need, feel free to give me a
feedback. It may give me thoughts for enhancements ;-)

Regards. Philippe.

Browse pgsql-general by date

  From Date Subject
Next Message Vitaly Burovoy 2016-09-23 21:12:51 Re: jsonb_set for nested new item?
Previous Message Tom van Tilburg 2016-09-23 20:34:32 inconsistent behaviour of set-returning functions in sub-query with random()