From: | Steve Wampler <swampler(at)noao(dot)edu> |
---|---|
To: | Postgres-performance <pgsql-performance(at)postgresql(dot)org> |
Subject: | Improving a simple query? |
Date: | 2003-07-13 18:05:15 |
Message-ID: | 1058119515.23012.114.camel@weaver.tuc.noao.edu |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
I'm not an SQL or PostgreSQL expert.
I'm getting abysmal performance on a nested query and
need some help on finding ways to improve the performance:
Background:
RH 8.0 dual-CPU machine (1.2GHz athlon)
Postgresql 7.2
1GB ram
(Machine is dedicated to postgres, so there's
not much else running.)
The table has ~500K rows.
Table definition:
lab.devel.configdb=# \d attributes_table
Table "attributes_table"
Column | Type | Modifiers
--------+--------------------------+---------------
id | character varying(64) | not null
name | character varying(64) | not null
units | character varying(32) |
value | text |
time | timestamp with time zone | default now()
Indexes: id_index,
name_index
Primary key: attributes_table_pkey
Triggers: trigger_insert
View definition:
lab.devel.configdb=# \d attributes;
View "attributes"
Column | Type | Modifiers
--------+-----------------------+-----------
id | character varying(64) |
name | character varying(64) |
units | character varying(32) |
value | text |
View definition: SELECT attributes_table.id,
attributes_table.name, attributes_table.units,
attributes_table.value FROM attributes_table;
Query:
select * from attributes_table where id in (select id from
attributes where (name='obsid') and (value='oid00066'));
Now, the inner SELECT is fast:
lab.devel.configdb=# explain analyze select id from attributes
where (name='obsid') and (value='oid00066');
NOTICE: QUERY PLAN:
Index Scan using name_index on attributes_table
(cost=0.00..18187.48 rows=15 width=25)
(actual time=0.33..238.06 rows=2049 loops=1)
Total runtime: 239.28 msec
EXPLAIN
But the outer SELECT insists on using a sequential scan [it should
pick up about 20K-40K rows (normally, access is through a
script].
How slow? Slow enough that:
explain analyze select * from attributes_table where id in
(select id from attributes where (name='obsid') and
(value='oid00066'));
hasn't completed in the last 15 minutes.
Removing the analyze gives:
lab.devel.configdb=# explain select * from attributes_table where
id in (select id from attributes where (name='obsid') and
(value='oid00066'));
NOTICE: QUERY PLAN:
Seq Scan on attributes_table
(cost=100000000.00..8873688920.07 rows=241201 width=59)
SubPlan
-> Materialize (cost=18187.48..18187.48 rows=15 width=25)
-> Index Scan using name_index on attributes_table
(cost=0.00..18187.48 rows=15 width=25)
EXPLAIN
Obviously, something is forcing the outer select into a
sequential scan, which is what I assume is the bottleneck
(see above about lack of expert-ness...).
I've played with the settings in postgresql.conf, using
the on-line performance tuning guide:
shared_buffers = 8192 # 2*max_connections, min 16
max_fsm_relations = 1000 # min 10, fsm is free space map
max_fsm_pages = 10000 # min 1000, fsm is free space map
max_locks_per_transaction = 128 # min 10
wal_buffers = 64 # min 4
sort_mem = 128 # min 32
vacuum_mem = 4096 # min 1024
wal_files = 32 # range 0-64 (default was 0)
effective_cache_size = 96000 # default in 8k pages
random_page_cost = 3
but haven't noticed an significant change with these settings
over more conservative settings.
Any suggestions? Is there a better way to phrase the query
that would provide order-of-magnitude improvement?
Thanks!
Steve
--
Steve Wampler -- swampler(at)noao(dot)edu
Quantum materiae materietur marmota monax si marmota
monax materiam possit materiari?
From | Date | Subject | |
---|---|---|---|
Next Message | Richard Huxton | 2003-07-13 19:09:17 | Re: Improving a simple query? |
Previous Message | Joe Conway | 2003-07-13 17:30:43 | Re: Pgsql - Red Hat Linux - VS MySQL VS MSSQL |