Skip site navigation (1) Skip section navigation (2)

Re: Query seem to slow if table have more than 200 million rows

From: "Ahmad Fajar" <gendowo(at)konphalindo(dot)or(dot)id>
To: "'Qingqing Zhou'" <zhouqq(at)cs(dot)toronto(dot)edu>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Query seem to slow if table have more than 200 million rows
Date: 2005-09-27 09:39:55
Message-ID: SVONERLVbNDN4CxpAOA00000061@ki-communication.com (view raw or flat)
Thread:
Lists: pgsql-performance
Hi Qingqing,

I don't know whether the statistic got is bad or good, this is the
statistic:
scooby=# select a.relid, a.relname, b.indexrelid, b.indexrelname,
c.idx_scan, c.idx_tup_read, c.idx_tup_fetch,
scooby-# a.heap_blks_read, a.heap_blks_hit, a.idx_blks_read, a.idx_blks_hit,
scooby-# a.toast_blks_read, a.toast_blks_hit, a.tidx_blks_read,
a.tidx_blks_hit, b.idx_blks_read, b.idx_blks_hit
scooby-# from pg_statio_user_tables a, pg_statio_user_indexes b,
pg_stat_all_indexes c
scooby-# where a.relid=b.relid and a.relid=c.relid and
b.indexrelid=c.indexrelid and a.relname=b.relname and
scooby-# a.relname=c.relname and a.relname='fti_dict1';
  relid   |  relname  | indexrelid | indexrelname | idx_scan | idx_tup_read
| idx_tup_fetch | heap_blks_read | heap_blks_hit | idx
_blks_read | idx_blks_hit | toast_blks_read | toast_blks_hit |
tidx_blks_read | tidx_blks_hit | idx_blks_read | idx_blks_hit
----------+-----------+------------+--------------+----------+--------------
+---------------+----------------+---------------+----
-----------+--------------+-----------------+----------------+--------------
--+---------------+---------------+--------------
 22880226 | fti_dict1 |   22880231 | idx_dict3    |        0 |            0
|             0 |              0 |             0 |
         0 |            0 |                 |                |
|               |             0 |            0
 22880226 | fti_dict1 |   22880230 | idx_dict2    |        7 |       592799
|        592799 |              0 |             0 |
         0 |            0 |                 |                |
|               |             0 |            0
 22880226 | fti_dict1 |   22880229 | idx_dict1    |        0 |            0
|             0 |              0 |             0 |
         0 |            0 |                 |                |
|               |             0 |            0
(3 rows)

I have try several time the query below with different keyword, but I just
got idx_tup_read and idx_tup_fetch changed, others keep zero. 
The Index are:
Ids (Idx_dict1), 
keywords (idx_dict2 varchar_ops),
keywords (idx_dict3 varchar_pattern_ops) ==> I use this index for query ...
keywords like 'blabla%', just for testing purpose

Regards,
ahmad fajar

-----Original Message-----
From: pgsql-performance-owner(at)postgresql(dot)org
[mailto:pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of Qingqing Zhou
Sent: Selasa, 27 September 2005 8:43
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] Query seem to slow if table have more than 200
million rows


""Ahmad Fajar"" <gendowo(at)konphalindo(dot)or(dot)id> wrote
>
> Select ids, keywords from dict where keywords='blabla' ('blabla' is a 
> single
> word);
>
> The table have 200 million rows, I have index the keywords field. On the
> first time my query seem to slow to get the result, about 15-60 sec to get
> the result. But if I repeat the query I will get fast result. My question 
> is
> why on the first time the query seem very slow.
>
> Table structure is quite simple:
>
> Ids bigint, keywords varchar(150), weight varchar(1), dpos int.
>

The first slowness is obviously caused by disk IOs. The second time is 
faster because all data pages it requires are already in buffer pool. 200 
million rows is not a problem for btree index, even if your client tool 
appends some spaces to your keywords at your insertion time, the ideal btree

is 5 to 6 layers high at most. Can you show the iostats of index from your 
statistics view? 
http://www.postgresql.org/docs/8.0/static/monitoring-stats.html#MONITORING-S
TATS-VIEWS

Regards,
Qingqing



---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend


In response to

Responses

pgsql-performance by date

Next:From: Gnanavel SDate: 2005-09-27 09:50:05
Subject: PostgreSQL overall design
Previous:From: Andrey RepkoDate: 2005-09-27 09:14:31
Subject: Index not used on group by

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group