Re: 10 TB database

From: Grzegorz Jaśkiewicz <gryzman(at)gmail(dot)com>
To: Artur <a_wronski(at)gazeta(dot)pl>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: 10 TB database
Date: 2009-06-15 12:29:48
Message-ID: 2f4958ff0906150529pd119314v84d06704288908e0@mail.gmail.com
Lists: pgsql-general

On Mon, Jun 15, 2009 at 1:00 PM, Artur <a_wronski(at)gazeta(dot)pl> wrote:
> Hi!
>
> We are thinking to create some stocks related search engine.
> It is experimental project just for fun.
>
> The problem is that we expect to have more than 250 GB of data every month.
> This data would be in two tables. About 50.000.000 new rows every month.

Well, obviously you need to decrease its size by doing some
normalization, then.
If some information is repeated across the table, move it into a
separate table and assign an id to it.

If you can send me a sample of that data, I could tell you where to cut the size.
I have databases that big under my wing, and that's where
normalization starts to make sense: it saves space (and hence speeds
things up).
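
For example (I'm only guessing at your schema here, all names are made
up): instead of repeating the ticker symbol and company name in every
quote row, keep them in their own table and reference them by id:

  CREATE TABLE symbols (
      id      serial PRIMARY KEY,
      ticker  varchar(10) NOT NULL UNIQUE,
      name    text
  );

  CREATE TABLE quotes (
      symbol_id  integer NOT NULL REFERENCES symbols(id),
      quoted_at  timestamptz NOT NULL,
      price      numeric(12,4) NOT NULL,
      volume     bigint
  );

A 4-byte integer instead of a repeated text column adds up quickly at
50.000.000 new rows a month.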

> We want to have access to all the data, mostly for generating user-requested
> reports (aggregating).
> We would have about 10TB of data in three years.

For that sort of database you will need partitioning for sure.
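
Something along these lines (names taken from the made-up example
above; you would also need a trigger or rule to route inserts into the
right child table):

  -- one child table per month, inheriting from the big table
  CREATE TABLE quotes_2009_06 (
      CHECK (quoted_at >= '2009-06-01' AND quoted_at < '2009-07-01')
  ) INHERITS (quotes);

  CREATE INDEX quotes_2009_06_quoted_at_idx
      ON quotes_2009_06 (quoted_at);

  -- in postgresql.conf, so the planner can skip partitions that a
  -- query's date range cannot touch:
  -- constraint_exclusion = on

That way a report over one quarter only scans three children, and old
months can be dropped or moved to cheaper storage without touching the
rest of the table.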

Write to me and I can help privately, maybe for a small tribute ;)

--
GJ
