Data warehousing using postgresql is it possible ?
Answer is yes, it is possible.
Postgresql has always been considered as not suitable for data
warehousing because of advanced database features present in it, like
transaction support , locking, etc.
But the fact is: We cannot decide performance of a database just by
reading about it, best way is to practically implement the scenario,
tune it and compare it with other databases.
I did just that and following are the points one should consider to
achieve data warehouse in postgresql.
Handling transactions and locks
-Optimum transaction size
-Parallelism to avoid locks
PostgreSQL parameter configuration
-Tweaking postgresql configuration parameters to best utilize
Compartmentalize data storage
-Making logically physical division of data on disk
Views and fact tables
-Frequency of create and update view and maintaining fact tables
Column Index planning
-Deciding use of index.
-Planning a query based on requested criteria
Hardware/Software properties that effect performance
-RAM, HDD properties and softwares that can increase /
decrease system performance
Case study: Open source SIEM Cyberoam-iview
Cyberoam-iView is a practically successful example of A Real time data
warehouse using unchanged postgresql 8.4 and files.
-Supports up to 60,000 Events/Sec. ( collect, parse, process, insert
into postgresql tables)
-1040 predefined reports (OLAP Aggregated cubes).
-Storage capacity: Aggregated cubes with granularity of 5 minutes for
1 week, cubes with granularity of 4 hours for 1 year, cubes with
granularity of 24 hours for 7 years. Fact table in compressed form.
-Response time < 30 seconds for most of the reports.
-Real time view update, Aggregation cubes updated every 5 min.
-Reports for any time range selection (up to 7 years) supported.
-Does not support ad hoc aggregated report generation from fact table,
All reports needs to be predefined at the time of deployment.
-Fact table need to be extracted and loaded before searching.
-Each cube is updated with top n significant aggregated records where
0 < n < 4001.
-->Minimum Hardware requirements :
Pentium P4 processor 3.0 G Hz
2 GB RAM
1 TB HDD , standard company with read write speed > 50 MBps
pgsql-admin by date
|Next:||From: Shruthi A||Date: 2009-11-02 13:26:47|
|Subject: Need help in enabling remote connection|
|Previous:||From: Hengky Lie||Date: 2009-11-02 05:18:45|
|Subject: Help ! Service fails to start|