Quick Links

Performance indexing of a simple query

From:	Mark Fox <mark(dot)fox(at)gmail(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Performance indexing of a simple query
Date:	2005-08-24 20:43:51
Message-ID:	bca49a9f05082413435f737033@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

I have a table called 'jobs' with several million rows, and the only
columns that are important to this discussion are 'start_time' and
'completion_time'.

The sort of queries I want to execute (among others) are like:

SELECT * FROM jobs
WHERE completion_time > SOMEDATE AND start_time < SOMEDATE;

In plain english: All the jobs that were running at SOMEDATE. The
result of the query is on the order of 500 rows.

I've got seperate indexes on 'start_time' and 'completion_time'.

Now, if SOMEDATE is such that the number of rows with completion_time
> SOMEDATE is small (say 10s of thousands), the query uses index scans
and executes quickly. If not, the query uses sequential scans and is
unacceptably slow (a couple of minutes). I've used EXPLAIN and
EXPLAIN ANALYZE to confirm this. This makes perfect sense to me.

I've played with some of the memory settings for PostgreSQL, but none
has had a significant impact.

Any ideas on how to structure the query or add/change indexes in such
a way to improve its performance? In desperation, I tried using a
subquery, but unsurprisingly it made no (positive) difference. I feel
like there might be a way of using an index on both 'completion_time'
and 'start_time', but can't put a temporal lobe on the details.

Mark

Responses

Re: Performance indexing of a simple query at 2005-08-24 21:22:34 from Jim C. Nasby
Re: Performance indexing of a simple query at 2005-08-24 23:42:00 from Tom Lane

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Jim C. Nasby	2005-08-24 20:57:44	Re: Read/Write block sizes
Previous Message	Chris Hoover	2005-08-24 20:26:40	Re: Some ideas for comment