Re: database performance and query performance question

From: "Shea,Dan [CIS]" <Dan(dot)Shea(at)ec(dot)gc(dot)ca>
To: "'josh(at)agliodbs(dot)com'" <josh(at)agliodbs(dot)com>, "Shea,Dan [CIS]" <Dan(dot)Shea(at)ec(dot)gc(dot)ca>, pgsql-performance(at)postgresql(dot)org
Subject: Re: database performance and query performance question
Date: 2004-01-22 20:35:59
Message-ID: F2D63B916C88C14D9B59F93C2A5DD33F0B9119@cisxa.cis.ec.gc.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


Something that I do not understand is why if you use a valid_time =
'2004-01-22 00:00:00' the query will use the index but if you do a
valid_time > '2004-01-22 00:00:00' it does not use the index?
PWFPM_DEV=# explain analyze select * from forecastelement where valid_time >
date '2004-01-23'::date limit 10;
QUERY PLAN
----------------------------------------------------------------------------
------------------------------------------------------------
Limit (cost=0.00..3.82 rows=10 width=129) (actual
time=199550.388..199550.783 rows=10 loops=1)
-> Seq Scan on forecastelement (cost=0.00..2722898.40 rows=7131102
width=129) (actual time=199550.382..199550.757 rows=10 loops=1)
Filter: (valid_time > '2004-01-23 00:00:00'::timestamp without time
zone)
Total runtime: 199550.871 ms
(4 rows)

PWFPM_DEV=# explain analyze select * from forecastelement where valid_time =
date '2004-01-23'::date limit 10;

QUERY PLAN
----------------------------------------------------------------------------
----------------------------------------------------------------------------
--------
Limit (cost=0.00..18.76 rows=10 width=129) (actual time=176.141..276.577
rows=10 loops=1)
-> Index Scan using forecastelement_vrwi_idx on forecastelement
(cost=0.00..160770.98 rows=85707 width=129) (actual time=176.133..276.494
rows=10 loops=1)
Index Cond: (valid_time = '2004-01-23 00:00:00'::timestamp without
time zone)
Total runtime: 276.721 ms
(4 rows)

-----Original Message-----
From: Josh Berkus [mailto:josh(at)agliodbs(dot)com]
Sent: Thursday, January 22, 2004 3:01 PM
To: Shea,Dan [CIS]; pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] database performance and query performance
question

Dan,

> Should there be less columns in the index?
> How can we improve database performance?
> How should I improve my query?

Your query plan isn't the problem. It's a good plan, and a reasonably
efficient query. Under other circumstances, the SELECT DISTINCT with the
to_char could be a performance-killer, but it's not in that result set.

Overall, you're taking 9 seconds to scan 93 million records. Is this the
time
the first time you run the query, or the 2nd and successive times?

When did you last run VACUUM ANALYZE on the table? Have you tried
increasing
the ANALYZE statistics on the index columns to, say, 500?

Your disks are RAID 5. How many drives? In RAID5, more drives improves the

speed of large scans.

And what's your sort_mem setting? You didn't mention it.

Why is your effective cache size only 300mb when you have 3 GB of RAM? It's

not affecting this query, but it could affect others.

--
-Josh Berkus
Aglio Database Solutions
San Francisco

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Josh Berkus 2004-01-22 20:43:15 Re: database performance and query performance question
Previous Message Shea,Dan [CIS] 2004-01-22 20:35:36 Re: database performance and query performance question