Re: Postgresql data archiving best practises

From: "L(dot) Loewe" <lloewe(at)hotmail(dot)com>
To: pgsql-novice(at)postgresql(dot)org
Subject: Re: Postgresql data archiving best practises
Date: 2010-04-09 00:31:26
Message-ID: op.vavfuoxrb4gxma@oakcreek2
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

On Wed, 07 Apr 2010 04:02:04 -0600, <machielr(at)rdc(dot)co(dot)za> wrote:

Do you build/design the database(s) or are you just responsible
for maintaining it?

I mean even if there was an easy way to know when a row was inserted
into a table - how do you know it's not needed in the future?
I don't think an employee would be happy if he was removed from
the db just because his data didn't change over the last 5 years.

It's up to the client to specify exactly what data should be
removed from the live db (aka archived) matching what criteria.

You can speed up the process of (re)moving data rows considerably
by making use of inheritance/partitioning - but in any case you need
a solid set of rules of when and what and properly designed tables.

Backup is a different story.
There are several ways to backup live databases - quite a few are
described in the postgresql docs
http://www.postgresql.org/docs/8.4/static/backup.html

As for dependencies:
There are the pg_depend and pg_shdepend catalogs that contain this
kind of information - however you need some smart queries or tools to
make that data easily human readable.
Also if you try to (begin;) drop table xyz; (rollback;)
postgresql will tell you if there are db objects depending on that table
- probably not the smartest way to get that information but
it'll work ;)

Keep in mind though that there is no way of knowing if some application
depends on a certain table - again that's the clients call.

HTH

> HI All
>
> Me again ;-(
>
>
> The second thing I need some info or assistance with is Data
> archiving practises.
>
>
> As per a previous post, our client have about 190 databases
> which will be consolidated and some databases removed.
>
> However one of our tasks is to do archiving of data.
>
> As far as we have come so far they need to keep current year's
> data + 5 years, and anything prior to this needs to be archived.
>
>
> I do not have that much dba experience yet, and the rest of our
> company focuses on Oracle.
>
> The plan is to make proper full backups of the databases and
> testing the backup restores before commencing.
>
> Then we need to archive the data and test this afterwards.
>
>
> However, we are concerned about the following:
>
> - is there a way in postgresql to determine the table dependencies?
> - In postgresql , how can we determine how old the data is (i.e.
> what can be archived and what falls within the criteria)... the tables
> mostly does not keep dates as part of the data itself so we need to
> determine how to check this.
>
>
> Once again all assistance is appreciated (from a learning postgresql
> dba)
>
>
> Regards
> Machiel
>
>

--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/

In response to

Browse pgsql-novice by date

  From Date Subject
Next Message peter 2010-04-11 12:31:07 Re: Incomplete pg_dump operation
Previous Message Mladen Gogala 2010-04-08 18:49:16 Re: FW: Postgres cpu & memory usage