Re: Creating complex track changes database - challenge!

From: Łukasz Jarych <jaryszek(at)gmail(dot)com>
To: Thiemo Kellner <thiemo(at)gelassene-pferde(dot)biz>, pgsql-general(at)postgresql(dot)org
Subject: Re: Creating complex track changes database - challenge!
Date: 2018-02-27 07:05:36
Message-ID: CAGv31ofRHVy36HE3dNaC3xSgsaJ9YXsYp5RVRba37B+Z=32F-Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Thiemo,

you can share the repository, maybe when i will go more into PostgreSQL i
would help you.

*Regarding table versionig.*
I am thinking about simple solution:
1. Create query or trigger which will be checking last date of inputed data
within Table.
2. Export the table into seperate file/back up/structure using
Date_TableName.
3. Have a table where i would complete all history together in one place.

*Regarding Database Versioning*
Write script which will be exporting whole database into *.sql file when
administrator wants. And adding variable with description, for example
"Added new table to database".
Next commit this file into bitbucket automatically using bash script and
compare results via bitbucket.

What do you think ?

Best,
Jacek

2018-02-27 8:11 GMT+01:00 Thiemo Kellner <thiemo(at)gelassene-pferde(dot)biz>:

> Hi Lukasz
>
> I am working on a generic (reading the information schema and other
> database metadata), trigger based solution for SCD tables, i. e. tables
> that keep (or not according to SCD type) history of the data. However, it
> is not far grown and I am not having much time to advance it so it evolves
> very slowly. If you are interested, I would open a sourceforge project or
> the like and we can work on it together.
>
> I am very much surprised that no database I know of supports
> SCD/historising tables out of the box. In 16 years as ETL pro I have seen
> reinvented the wheel all the time... maybe PostgreSQL wants to get a head
> start on this.
>
> Kind regards
>
> Thiemo
>
>
>
> Zitat von ?ukasz Jarych <jaryszek(at)gmail(dot)com>:
>
> Hi Guys,
>>
>> I have idea already for creating this complex solution.
>>
>> Please give your notes and tips if you have.
>>
>> 1. Keep all changes within table including:
>> -adding rows
>> -deleting
>> -editing
>>
>> This can be managed by adding triggers and one additional table where you
>> can have sum up what was changed.
>>
>> 2. Changing DDL of tables:
>>
>> I think that creating trigger for metadata should solve the problem. How
>> can i do it? I do not know already ...:)
>>
>> 3. Changing tables versioning.
>>
>> It it is possible to save table (back up or something) to disc - i can
>> check the latest date of change and save table with this date and name.
>> And create table with all tables changes and version.
>> What do you think ?
>>
>> 4. Still problem with creating whole database versioning.
>> I found very interesting link but i not understand how it is works:
>>
>> https://odetocode.com/blogs/scott/archive/2008/02/02/version
>> ing-databases-change-scripts.aspx
>>
>> Best,
>> Jacek
>>
>> 2018-02-26 12:16 GMT+01:00 ?ukasz Jarych <jaryszek(at)gmail(dot)com>:
>>
>> Hi Manual,
>>> thank you very much!
>>>
>>> Regarding your tool - if it is not supported and it is for specific case
>>> - i will not use it but figure out something new. I do not even how to
>>> install this .hs files...
>>>
>>> I thought about creating triggers to have all changes to specific tables.
>>> And for each table name (or number) keep changes in one separate table.
>>> What do you think about it?
>>>
>>> If all you need is saving and restoring specific table states, logical
>>>
>>>> dumps with pg_dump should probably be enough for your needs.
>>>>
>>>
>>>
>>> Can you explain in details how can i use it?
>>> What if user add new column? I can save ma table for example as version 3
>>> and come back to version 1 in the future? (without this new column?)
>>>
>>> Best,
>>> Jacek
>>>
>>>
>>>
>>> 2018-02-26 12:04 GMT+01:00 Manuel Gómez <targen(at)gmail(dot)com>:
>>>
>>> On Mon, Feb 26, 2018 at 11:44 AM ?ukasz Jarych <jaryszek(at)gmail(dot)com>
>>>>
>>>> wrote:
>>>>
>>>> I have to:
>>>>>
>>>>> 1. Keep all changes within table including:
>>>>> -adding rows
>>>>> -deleting
>>>>> -editing
>>>>>
>>>>> 2. Save table with specific state and recover specific state (so go
>>>>> back
>>>>> to previous table versions) including comparing tables.
>>>>>
>>>>> 3. Track all DLL and DML changes with possibility to ho back to
>>>>> previous
>>>>> version.
>>>>>
>>>>>
>>>> Hi,
>>>>
>>>> I had similar needs long ago, so I wrote this tool I called Squealer,
>>>> which would transform a specification of a SQL database schema into some
>>>> PostgreSQL DDL to create a database that implements the same schema in a
>>>> logical sense, but actually stores all historical rows, and even permits
>>>> existing rows to have references to (soft-)deleted rows, all while
>>>> providing modifiable views that simulate the behavior of a regular
>>>> table as
>>>> specified in the input schema through generous use of INSTEAD OF
>>>> triggers.
>>>> It works somewhat like having version control for your database.
>>>>
>>>> You may find the source code here: https://github.com/mgomezch/squealer
>>>> Unfortunately, it has hardly any comments, it is completely unmaintained
>>>> and probably unused anywhere, I have no idea whether it builds with
>>>> today's
>>>> libraries, and it does not necessarily break the tradeoffs in this
>>>> space in
>>>> a way that fits your use case.
>>>>
>>>> Note there are major caveats with keeping all historical data around
>>>> forever, and the semantics of querying historical data can get
>>>> complicated,
>>>> let alone having current data refer to deleted, historical data. I
>>>> built
>>>> this for a very specific use case where this was the right design, but
>>>> please consider very carefully whether this is what you want.
>>>>
>>>> Storing your database history forever would take a lot of space.
>>>> Consider whether you can instead keep a record of changes stored outside
>>>> the database in some cheap cold storage. Also consider just keeping a
>>>> set
>>>> of tables with dynamically structured event records (e.g. JSON fields)
>>>> partitioned by time ranges and retained only temporarily, perhaps even
>>>> in a
>>>> separate database. Any such solution will have significant cost and
>>>> performance impact if your database bears a nontrivial load, so be
>>>> careful.
>>>>
>>>> You could also just place your database on a PostgreSQL cluster by
>>>> itself
>>>> and then keep all WAL segments archived forever, so you could just do
>>>> point-in-time recovery to any point in the history of your database.
>>>> The
>>>> space required would grow very quickly, though, so if you don't really
>>>> need
>>>> the full history forever, but only a fixed retention period, you can
>>>> surely
>>>> use any of the well-known solutions for PostgreSQL backups that allow
>>>> for
>>>> this through WAL archiving and periodic basebackups: e.g. WAL-E, WAL-G,
>>>> pgBackRest?
>>>>
>>>> If all you need is saving and restoring specific table states, logical
>>>> dumps with pg_dump should probably be enough for your needs.
>>>>
>>>>
>>>
>>>
>>
>
>
> --
> +49 (0)1578-772 37 37
> +41 (0)78 947 36 21
> Öffentlicher PGP-Schlüssel: http://pgp.mit.edu/pks/lookup?
> op=get&search=0x8F70EFD2D972CBEF
>
> ----------------------------------------------------------------
> This message was sent using IMP, the Internet Messaging Program.
>
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Thiemo Kellner 2018-02-27 07:11:26 Re: Creating complex track changes database - challenge!
Previous Message Łukasz Jarych 2018-02-27 06:25:30 Re: Creating complex track changes database - challenge!