Skip site navigation (1) Skip section navigation (2)

Re: very large db performance question

From: LIANHE SHAO <lshao2(at)jhmi(dot)edu>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: very large db performance question
Date: 2003-11-26 22:46:24
Message-ID: 4e2c4d4e061f.4e061f4e2c4d@jhmimail.jhmi.edu (view raw or flat)
Thread:
Lists: pgsql-performance
Thanks for reply. Actually our database only supply
some scientists to use (we predict that). so there
is no workload problem. there is only very
infrequent updates. the query is not complex. the
problem is, we have one table that store most of the
data ( with 200 million rows). In this table, there
is a text column which we need to do full text
search for each row. The result will then join the
data from another table which has 30,000 rows. Now
the query runs almost forever. 

I tried a small table with 2 million rows using  the
following simple command, it takes me about 6
seconds to get the result back. So, I get confused.
That is why I ask: Is it the hardware problem or
something else. (I just vacuumed the whole database
yesterday). 
 
PGA=> select count (*) from expressiondata ;
  count
---------
 2197497
(1 row)


PGA=> explain select count (*) from expressiondata ;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Aggregate  (cost=46731.71..46731.71 rows=1 width=0)
   ->  Seq Scan on expressiondata 
(cost=0.00..41237.97 rows=2197497 width=0)
(2 rows)

 
 
Regards, 
William

----- Original Message -----
From: Neil Conway <neilc(at)samurai(dot)com>
Date: Wednesday, November 26, 2003 10:03 pm
Subject: Re: [PERFORM] very large db performance
question

> LIANHE SHAO <lshao2(at)jhmi(dot)edu> writes:
> > We will have a very large database to store
microarray data (may
> > exceed 80-100G some day). now we have 1G RAM, 2G
Hz Pentium 4, 1
> > CPU. and enough hard disk.
> 
> > Could anybody tell me that our hardware is an
issue or not?
> 
> IMHO the size of the DB is less relevant than the
query workload. For
> example, if you're storying 100GB of data but only
doing a single
> index scan on it every 10 seconds, any modern
machine with enough HD
> space should be fine.
> 
> If you give us an idea of the # of queries you
expect per second, the
> approximate mix of reads and writes, and some idea
of how complex the
> queries are, we might be able to give you some
better advice.
> 
> -Neil
> 
> 
> 


Responses

pgsql-performance by date

Next:From: Tom LaneDate: 2003-11-27 00:09:25
Subject: Re: Followup - expression (functional) index use in joins
Previous:From: Hannu KrosingDate: 2003-11-26 22:32:30
Subject: Re: cross table indexes or something?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group