Skip site navigation (1) Skip section navigation (2)

Re: 4G row table?

From: Ron Johnson <ron(dot)l(dot)johnson(at)cox(dot)net>
To: PgSQL Performance ML <pgsql-performance(at)postgresql(dot)org>
Subject: Re: 4G row table?
Date: 2002-12-19 19:36:36
Message-ID: 1040326596.28772.186.camel@haggis (view raw or flat)
Thread:
Lists: pgsql-performance
On Thu, 2002-12-19 at 13:10, george young wrote:
> [linux, 700MHz athlon, 512MB RAM, 700GB 10kRPM SCSI HW RAID, postgresql 7.2]
> We're setting up a DB of IC test data, which is very simple and regular, but large.
> One project (we get three or four per year) has ~4 giga bits, each specified by
> a few small integer values, e.g.:
>  Name       Type Values
>  ----------------------
>  wafer      int	 1-50
>  die_row    int  2-7
>  die_col    int  2-7
>  testtype   string (~10 different short strings)
>  vdd        int  0-25
>  bit_col    int  0-127
>  bit_row    int  0-511
>  value      bit  0 or 1
> 
> with 4 billion(4e9) rows.  I would guess to make wafer, die_row, etc. be of
> type "char", probably testtype a char too with a separate testtype lookup table.
> Even so, this will be a huge table.  

How many records per day will be inserted?

Will they ever be updated?

Do you have to have *ALL* 4 billion records in the same table at the
same time?  As Josh Berkus mentioned, wafer thru bit_col can be
converted to INT2, if you make testtype use a lookup table; thus, each
tuple could be shrunk to 20 bytes, plus 24 bytes per tuple (in v7.3)
that would make the table a minimum of 189 billion bytes, not
including index!!!

Rethink your solution...

One possibility would to have a set of tables, with names like:
TEST_DATA_200301
TEST_DATA_200302
TEST_DATA_200303
TEST_DATA_200304
TEST_DATA_200305
TEST_DATA_200306
TEST_DATA_200307
TEST_DATA_<etc>

Then, each month do "CREATE VIEW TEST_DATA AS TEST_DATA_yyyymm" for the
current month.


> Questions: How much overhead will there be in the table in addition to the
> 9 bytes of data I see?  How big will the primary index on the first seven columns
> be?  Will this schema work at all? 
> 
> Of course, we could pack 128 bits into an 8 byte "text" field (or should we use bit(128)?),
> but lose some ease of use, especially for naive (but important) users.
> 
> Comments, suggestions?
> 
> -- George
>  
-- 
+---------------------------------------------------------------+
| Ron Johnson, Jr.        mailto:ron(dot)l(dot)johnson(at)cox(dot)net          |
| Jefferson, LA  USA      http://members.cox.net/ron.l.johnson  |
|                                                               |
| "My advice to you is to get married: If you find a good wife, |
| you will be happy; if not, you will become a philosopher."    |
|    Socrates                                                   |
+---------------------------------------------------------------+


In response to

pgsql-performance by date

Next:From: Tom LaneDate: 2002-12-19 22:46:20
Subject: Re: EXISTS vs IN vs OUTER JOINS
Previous:From: jasiekDate: 2002-12-19 19:28:30
Subject: Re: EXISTS vs IN vs OUTER JOINS

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group