Realistic upper bounds on table size

From: "A(dot)M(dot)" <agentm(at)cmu(dot)edu>
To: pgsql-admin(at)postgresql(dot)org
Subject: Realistic upper bounds on table size
Date: 2003-03-28 17:07:54
Message-ID: D10A4775-613F-11D7-9333-0030657192DA@cmu.edu
Lists: pgsql-admin

Sorry for the cross-posting but I wasn't able to elicit a response on
-general.

I'm trying to figure out the realistic upper bounds on a PostgreSQL
table, given the required use of indices and integer columns in a
single table.
An astronomy institution I'm considering working for receives
a monster amount of telescope data from a government observatory. Each
day, they download millions of rows of data (including position in the
sky, infrared reading, etc.) in CSV format. Most of the columns are
floats and integers. I would like to offer them an improvement over
their old system.
I would like to know how PostgreSQL does under such extreme
circumstances. For example, I may load the entire multi-million-row CSV
file into a table and then delete a million or so rows they are not
interested in. Would a vacuum at this point be prohibitively expensive?
If I add a few million rows to a table every day, can I expect
the necessary indices to keep up? In other words, will PostgreSQL be
able to keep up with their simple and infrequent selects on monster
amounts of data (potentially 15 GB/day moving in and out daily, with db
growth at ~+5 GB/day [millions of rows] in big blocks all at once),
assuming that they have top-of-the-line equipment for this sort of
thing (storage, memory, processors, etc.)? Is anyone else using
PostgreSQL on heavy-duty astronomy data? Thanks for any info.
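
For a rough sense of scale, the figures above can be turned into a
back-of-envelope estimate. The sketch below is only arithmetic on the
two numbers quoted in this message (~5 GB/day growth, ~15 GB/day moved);
the function names, the zero starting size, and the one-hour load window
are assumptions made for illustration, and on-disk size in a real
database would also include index and row overhead that is not modeled
here.

```python
# Back-of-envelope sizing sketch. Only the 5 GB/day and 15 GB/day
# figures come from the message; everything else is an assumption.

GROWTH_GB_PER_DAY = 5.0    # net database growth quoted in the message
TRAFFIC_GB_PER_DAY = 15.0  # data moved in and out per day, as quoted


def projected_size_gb(days, start_gb=0.0, growth_gb_per_day=GROWTH_GB_PER_DAY):
    """Raw data volume after `days` of steady growth (no index overhead)."""
    return start_gb + growth_gb_per_day * days


def sustained_rate_mb_per_s(gb_per_day, window_hours=24.0):
    """Average throughput needed to move `gb_per_day` within a load window."""
    return gb_per_day * 1024.0 / (window_hours * 3600.0)


if __name__ == "__main__":
    # Hypothetical horizons: one month, one year, five years.
    for days in (30, 365, 5 * 365):
        print(f"{days:5d} days: about {projected_size_gb(days):8.0f} GB of raw data")

    # Assumed one-hour nightly window for the bulk load.
    rate = sustained_rate_mb_per_s(TRAFFIC_GB_PER_DAY, window_hours=1.0)
    print(f"15 GB in a 1-hour window needs about {rate:.1f} MB/s sustained")
```

Under these assumptions the raw data alone passes a terabyte well
within the first year, so the vacuum and index questions above concern
a table that keeps growing rather than one that stays at a fixed size.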

><><><><><><><><><
AgentM
agentm(at)cmu(dot)edu
