Re: historical log of data records

From: Sanjay Minni <sanjay(dot)minni(at)gmail(dot)com>
To: Alban Hertroys <haramrae(at)gmail(dot)com>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: historical log of data records
Date: 2021-11-16 11:11:15
Message-ID: CAMpxBonpFbrnbXVV_9kEc3Wy4hYbRZoHTqgPyNA5C3a9JvsxRA@mail.gmail.com
Lists: pgsql-general

Alban,

It's a simple financial transaction processing application. The application
permits editing, updating, and deleting of entered data, even multiple times,
but an audit trail tracing the data through all versions back to its original
must be preserved.
(As outlined, programmatically I could approach this by keeping a parallel
set of tables and copying each row being replaced into that parallel table
set, or by keeping all record versions in a single table with a flag to
indicate the final / current version.)
I am asking whether there are better ways to do it.
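(For illustration, the parallel-table variant might look roughly like this; the table and column names here are hypothetical, and this is only a sketch, not a tested implementation:)

```sql
-- Hypothetical schema: a history table mirroring the live table,
-- populated by a trigger before every UPDATE or DELETE.
CREATE TABLE txn (
    id         bigint PRIMARY KEY,
    amount     numeric NOT NULL,
    updated_at timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE txn_history (LIKE txn);

CREATE FUNCTION txn_archive() RETURNS trigger AS $$
BEGIN
    -- Copy the outgoing row version into the history table.
    INSERT INTO txn_history VALUES (OLD.*);
    IF TG_OP = 'DELETE' THEN
        RETURN OLD;   -- let the DELETE proceed
    END IF;
    RETURN NEW;       -- let the UPDATE proceed with the new row
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER txn_archive_trg
    BEFORE UPDATE OR DELETE ON txn
    FOR EACH ROW EXECUTE FUNCTION txn_archive();
```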

with warm regards
Sanjay Minni
+91-9900-902902

On Tue, 16 Nov 2021 at 15:57, Alban Hertroys <haramrae(at)gmail(dot)com> wrote:

>
> > On 16 Nov 2021, at 10:20, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> wrote:
> >
> > On Tue, 2021-11-16 at 13:56 +0530, Sanjay Minni wrote:
> >> I need to keep a copy of old data as the rows are changed.
> >>
> >> For a general RDBMS I could think of keeping all the data in the same
> table with a flag
> >> to indicate older copies of updated / deleted rows or keep a parallel
> table and copy
> >> these rows into the parallel data under program / trigger control. Each
> has its pros and cons.
> >>
> >> In Postgres would I have to follow the same methods or are there any
> features / packages available ?
> >
> > Yes, I would use one of these methods.
> >
> > The only feature I can think of that may help is partitioning: if you
> have one partition
> > for the current data and one for the deleted data, then updating the
> flag would
> > automatically move the row between partitions, so you don't need a
> trigger.
>
> Are you building (something like) a data-vault? If so, keep in mind that
> you will have a row for every update, not just a single deleted row.
> Enriching the data can be really useful in such cases.
>
> For a data-vault at a previous employer, we determined how to treat new
> rows by comparing a (md5) hash of the new and old rows, adding the hash and
> a validity interval to the stored rows. Historic data went to a separate
> table for each respective current table.
>
> The current tables “inherited” the PK’s from the tables on the source
> systems (this was a data-warehouse DB). Obviously that same PK can not be
> applied to the historic tables where there _will_ be duplicates, although
> they should be at non-overlapping validity intervals.
>
> Alternatively, since this is time-series data, it would probably be a good
> idea to store that in a way optimised for that. TimescaleDB comes to mind,
> or arrays as per Pavel’s suggestion at
> https://stackoverflow.com/questions/68440130/time-series-data-on-postgresql
> .
>
> Regards,
>
> Alban Hertroys
> --
> If you can't see the forest for the trees,
> cut the trees and you'll find there is no forest.
>
>
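(Laurenz's partitioning idea above could be sketched roughly as follows, again with hypothetical names; list-partitioning on the flag column means that flipping the flag relocates the row:)

```sql
-- Hypothetical sketch: list-partition on a "current" flag so that
-- updating the flag automatically moves the row between partitions.
CREATE TABLE txn (
    id      bigint  NOT NULL,
    version int     NOT NULL,
    amount  numeric NOT NULL,
    current boolean NOT NULL
) PARTITION BY LIST (current);

CREATE TABLE txn_current PARTITION OF txn FOR VALUES IN (true);
CREATE TABLE txn_history PARTITION OF txn FOR VALUES IN (false);

-- "Deleting" or superseding a row then becomes an UPDATE that
-- relocates it into the history partition, with no trigger needed:
UPDATE txn SET current = false WHERE id = 42 AND current;
```

(Note that any unique constraint on such a partitioned table would have to include the partitioning column, which is why this sketch omits a primary key.)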
