From: "Ahmad Fajar" <gendowo(at)konphalindo(dot)or(dot)id>
To: "'Qingqing Zhou'" <zhouqq(at)cs(dot)toronto(dot)edu>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Query seem to slow if table have more than 200 million rows
Date: 2005-09-27 09:39:55
Message-ID: SVONERLVbNDN4CxpAOA00000061@ki-communication.com
Lists: pgsql-performance
Hi Qingqing,
I don't know whether these statistics are good or bad; here they are:
scooby=# select a.relid, a.relname, b.indexrelid, b.indexrelname,
scooby-#        c.idx_scan, c.idx_tup_read, c.idx_tup_fetch,
scooby-#        a.heap_blks_read, a.heap_blks_hit, a.idx_blks_read, a.idx_blks_hit,
scooby-#        a.toast_blks_read, a.toast_blks_hit, a.tidx_blks_read, a.tidx_blks_hit,
scooby-#        b.idx_blks_read, b.idx_blks_hit
scooby-# from pg_statio_user_tables a, pg_statio_user_indexes b, pg_stat_all_indexes c
scooby-# where a.relid = b.relid and a.relid = c.relid
scooby-#   and b.indexrelid = c.indexrelid
scooby-#   and a.relname = b.relname and a.relname = c.relname
scooby-#   and a.relname = 'fti_dict1';
  relid   |  relname  | indexrelid | indexrelname | idx_scan | idx_tup_read | idx_tup_fetch
----------+-----------+------------+--------------+----------+--------------+---------------
 22880226 | fti_dict1 |   22880231 | idx_dict3    |        0 |            0 |             0
 22880226 | fti_dict1 |   22880230 | idx_dict2    |        7 |       592799 |        592799
 22880226 | fti_dict1 |   22880229 | idx_dict1    |        0 |            0 |             0
(3 rows)

(In every row, heap_blks_read, heap_blks_hit, idx_blks_read and idx_blks_hit are all 0, and the toast_blks_* and tidx_blks_* counters are NULL; those columns are omitted above for readability.)
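One thing worth ruling out (my suggestion, not something from the thread): on 8.0 the per-block counters are only collected when stats_block_level is enabled, which would explain why every heap/index block figure stays at zero even while the tuple counters move. A quick check, as a sketch:

```sql
-- If this shows 'off', the *_blks_read / *_blks_hit counters in the
-- pg_statio_* views will never advance, regardless of query activity.
SHOW stats_block_level;

-- Enabling it (stats_block_level = true in postgresql.conf, then a
-- server reload) makes the block-level counters start accumulating.
```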
I have tried the query several times with different keywords, but only
idx_tup_read and idx_tup_fetch change; the others stay zero.
The indexes are:
ids (idx_dict1),
keywords (idx_dict2 varchar_ops),
keywords (idx_dict3 varchar_pattern_ops) ==> I use this index for queries like
keywords LIKE 'blabla%', just for testing purposes.
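(Editor's note: per TIP 6 at the bottom of the thread, one way to confirm that idx_dict3 is actually chosen for the prefix form, and to see the cold-versus-warm timing difference directly, is to run the query under EXPLAIN ANALYZE twice. Table and index names below are taken from the message above; this is a sketch, not output from the poster's server.)

```sql
-- First run: expect an Index Scan using idx_dict3 with a high actual
-- time (cold cache); a second run should show a much lower time once
-- the pages are in the buffer pool.
EXPLAIN ANALYZE
SELECT ids, keywords
FROM fti_dict1
WHERE keywords LIKE 'blabla%';
```

If the plan shows a sequential scan instead, the varchar_pattern_ops index is not being used for the pattern match.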
Regards,
ahmad fajar
-----Original Message-----
From: pgsql-performance-owner(at)postgresql(dot)org
[mailto:pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of Qingqing Zhou
Sent: Tuesday, 27 September 2005 8:43
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] Query seem to slow if table have more than 200
million rows
""Ahmad Fajar"" <gendowo(at)konphalindo(dot)or(dot)id> wrote
>
> Select ids, keywords from dict where keywords='blabla' ('blabla' is a
> single
> word);
>
> The table have 200 million rows, I have index the keywords field. On the
> first time my query seem to slow to get the result, about 15-60 sec to get
> the result. But if I repeat the query I will get fast result. My question
> is
> why on the first time the query seem very slow.
>
> Table structure is quite simple:
>
> Ids bigint, keywords varchar(150), weight varchar(1), dpos int.
>
The first-time slowness is almost certainly disk I/O. The second run is
faster because all the data pages it needs are already in the buffer pool.
200 million rows is not a problem for a btree index; even if your client
tool appends some spaces to your keywords at insertion time, the ideal
btree is at most 5 to 6 levels deep. Can you show the I/O statistics for
the index from your statistics views?
http://www.postgresql.org/docs/8.0/static/monitoring-stats.html#MONITORING-STATS-VIEWS
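(Editor's note: a narrower query against a single statistics view pulls just the per-index I/O counters Qingqing is asking for; column names are as documented in the 8.0 statistics views, and the table name is taken from the thread.)

```sql
-- Per-index block I/O for the fti_dict1 table: physical reads vs.
-- buffer-pool hits for each of its indexes.
SELECT indexrelname, idx_blks_read, idx_blks_hit
FROM pg_statio_user_indexes
WHERE relname = 'fti_dict1';
```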
Regards,
Qingqing
---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend