Re: Possibilities of PgSQL

From: Shane Ambler <pgsql(at)Sheeky(dot)Biz>
To: Karel Břinda <konference(at)brinda(dot)info>
Cc: pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Possibilities of PgSQL
Date: 2007-08-24 12:42:38
Message-ID: 46CED23E.3030100@Sheeky.Biz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Karel Břinda wrote:
>> 100MB of text in each row?? is each row a copy of the english
> dictionary?
>
> No, that was only example.
>
> It will be web application (Python + PgSQL). I thought about table with
> records + table with changes but is it some easy way how to get
> differences between 2 strings?
>

Using a trigger will be the most reliable way for you to go. Unless you
can find one that fits your needs (or gets you close) you will need to
build your own.

There is no built in functions that meet your needs.

A quick search at pgfoundry shows Auditable Postgresql
http://pgfoundry.org/projects/aupg/
this seems to have been started a while ago and been left but you may
find it to be a good starting point. The developer that started it may
even be able to help you to get it to meet your needs.

How precise do you want your changes to be recorded? Just each character
that has changed or the entire line that has changed?

If the text in you are storing is say 200-2000 lines of 100-200
characters per line then recording the line before and after the changes
may be fine.

If you are developing with python then look at using python for the
trigger function see
http://www.postgresql.org/docs/8.1/interactive/plpython.html and the
next couple of pages to get you started.

Either you write a trigger that asks another program to generate the
diff info for you that you can then record in your history table or you
write a function that goes through line by line (or character by
character) and records the differences.

>
> Shane Ambler pí¹e v Èt 23. 08. 2007 v 22:21 +0930:
>> Karel Bøinda wrote:
>>> Hi,
>>>
>>> I have a question about PgSQL. I am working at some project and I
> want
>>> to have few tables with special properties:
>>>
>>> There would be many rows and these would be changed every day for
> many
>>> times. I want to be able to get know how did the row looked in given
>>> time (f.e. 2 day ago, 6 minutes ago, 2nd Sept. 2002,...).
> Nevertheless
>>> it should work with diffs (f.e. if I had a row with 100MB string and
> I
>>> changed only 2 chars it should not require on disk 200MB but only
> 100MB
>>> + few (kilo)Bytes).
>> 100MB of text in each row?? is each row a copy of the english
> dictionary?
>> My thoughts are to have a table that records the changes, something
>> along the lines of a timestamp of the change with substring start and
>> end positions of the change with the before and after text that has
>> changed that can be used to 'replay' the changes.
>>
>> When you want to rewind to a certain time you find all entries after
> the
>> given time and undo the changes to get back to where it was.
>>
>>> Is there any possibility how to solve it? I have heard that I can do
> it
>>> with triggers (but the worst thing is how to implement diffs) but I
> hope
>>> that there is any other (easier) way.
>> Triggers are the most reliable way to implement this, I would think
> that
>> pl/perl may be the fastest way to implement it as a trigger.
>>
>> Basically you would need to compare before and after as the record
> was
>> saved - doing that with 100MB of data will make saving changes
> somewhat
>> slower.
>>
>> What sort of client is updating this data?
>> I would think that the client could keep track of changes as they are
>> typed and save this info with the updated data thus removing the time
> to
>> compare the before and after data at the DB end.
>>
>>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly
>

--

Shane Ambler
pgSQL(at)Sheeky(dot)Biz

Get Sheeky @ http://Sheeky.Biz

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Arnau 2007-08-24 13:25:48 PostgreSQL and virtualization
Previous Message Peter Eisentraut 2007-08-24 11:59:43 Re: tar, but not gnu tar