Re: multi billion row tables: possible or insane?

From: Markus Schaber <schabios(at)logi-track(dot)com>
To: John Arbash Meinel <john(at)arbash-meinel(dot)com>
Cc: Ramon Bastiaans <bastiaans(at)sara(dot)nl>, pgsql-performance(at)postgresql(dot)org
Subject: Re: multi billion row tables: possible or insane?
Date: 2005-03-01 16:26:48
Message-ID: 422497C8.4040109@logi-track.com
Lists: pgsql-performance

Hi, John,

John Arbash Meinel wrote:

>> I am doing research for a project of mine where I need to store
>> several billion values for a monitoring and historical tracking system
>> for a big computer system. My current estimate is that I have to store
>> (somehow) around 1 billion values each month (possibly more).
>>
> If you have that 1 billion perfectly distributed over all hours of the
> day, then you need 1e9/30/24/3600 = 385 transactions per second.

I hope that he does not use one transaction per inserted row.
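To make the arithmetic concrete, here is a small sketch of how the
transaction rate drops once several rows share a transaction. The
function names and the batch sizes in the loop are illustrative
assumptions, not figures from the original poster; the 30-day month is
the same simplification used in the estimate above.

```python
# Rough rate arithmetic for 1 billion rows per month.
SECONDS_PER_MONTH = 30 * 24 * 3600  # 30-day month, as in the quoted estimate


def rows_per_second(rows_per_month):
    """Average insert rate if rows arrive evenly over the month."""
    return rows_per_month / SECONDS_PER_MONTH


def transactions_per_second(rows_per_month, rows_per_transaction):
    """Commit rate when each transaction carries several rows."""
    return rows_per_second(rows_per_month) / rows_per_transaction


rate = rows_per_second(1e9)  # about 385.8 rows per second

# Hypothetical batch sizes, chosen only to show the trend.
for batch in (1, 100, 1000):
    tps = transactions_per_second(1e9, batch)
    print(f"{batch:>5} rows/transaction -> {tps:.2f} transactions/second")
```

The row rate stays the same; only the number of commits per second
changes with the batch size.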

In our in-house tests, we got a speedup factor of up to several hundred
when bundling rows on insertion. The fastest results came from batches
of a few thousand rows per transaction, with about 5 processes running
in parallel.
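A minimal sketch of the batching idea, not the code used in the tests
mentioned above: it uses Python's built-in sqlite3 module purely as a
self-contained stand-in for a PostgreSQL connection, and the table
name "samples", its columns, and the helper names are hypothetical.
With PostgreSQL the same pattern applies through whichever driver is
in use, and COPY is another option for bulk loading.

```python
import sqlite3
from itertools import islice


def batches(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def load(conn, rows, batch_size=1000):
    """Insert rows using one transaction per batch; return the transaction count."""
    n_transactions = 0
    for chunk in batches(rows, batch_size):
        # `with conn` commits on success and rolls back on an exception.
        with conn:
            conn.executemany(
                "INSERT INTO samples (ts, value) VALUES (?, ?)", chunk
            )
        n_transactions += 1
    return n_transactions


# sqlite3 in-memory database as a stand-in (assumption, see lead-in).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (ts INTEGER, value REAL)")

rows = ((i, i * 0.5) for i in range(10_000))
transactions = load(conn, rows, batch_size=1000)
count = conn.execute("SELECT COUNT(*) FROM samples").fetchone()[0]
print(transactions, count)
```

Running several such loaders in parallel, each with its own connection
and its own share of the data, would correspond to the multi-process
setup described above; this sketch shows only a single loader.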

Follow the usual performance tips: Use a small but fast-writing RAID
for the transaction log (no RAID-5 or RAID-6 variants), possibly a
mirrored pair of hard-disk-backed SSDs. Use separate disks for the
actual data (here, LVM2 with growing volumes could be very handy). Have
enough RAM. Use a fast file system.

BTW, as you read about the difficulties that you'll face with this
enormous amount of data: Don't think that your task will be much easier
or cheaper using any other DBMS, whether commercial or open source. For
all of them, you'll need "big iron" hardware and a skilled team of
admins to set up and maintain the database.

Markus

--
markus schaber | dipl. informatiker
logi-track ag | rennweg 14-16 | ch 8001 zürich
phone +41-43-888 62 52 | fax +41-43-888 62 53
mailto:schabios(at)logi-track(dot)com | www.logi-track.com
