Help w/speeding up range queries?

From: John Major <major(at)cbio(dot)mskcc(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: Help w/speeding up range queries?
Date: 2006-10-31 23:18:38
Message-ID: 4547D9CE.2040705@cbio.mskcc.org
Lists: pgsql-performance
Hello-

I am a biologist and work with large datasets (tables with millions of
rows are common).
These datasets can often be simplified as features with a name and a
start and end position (i.e., a range along a number line: GeneX is on
some chromosome from position 10->40).

I store these features in tables that generally have the form:

SIMPLE_TABLE:
FeatureID(PrimaryKey) -- FeatureName(varchar) -- 
FeatureChromosomeName(varchar) -- StartPosition(int) -- EndPosition(int)
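
For concreteness, the layout above might be written as the DDL below. This is only a sketch: the lowercase table and column names, the integer key type, and the NOT NULL constraints are assumptions, not taken from the actual schema.

```sql
-- Hypothetical DDL for the layout described above.
-- Names, key type, and NOT NULL constraints are assumed for illustration.
CREATE TABLE simple_table (
    feature_id              integer PRIMARY KEY,
    feature_name            varchar NOT NULL,
    feature_chromosome_name varchar NOT NULL,
    start_position          integer NOT NULL,  -- start of the range
    end_position            integer NOT NULL   -- end of the range
);
```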

My problem is that I often need to search tables like these for
"all features within a range", e.g.:

select FeatureID from SIMPLE_TABLE where FeatureChromosomeName like
'chrX' and StartPosition > 1000500 and EndPosition < 2000000;

This kind of query is VERY slow, and I've tried tinkering with indexes
to speed it up, but with little success.
Indexes on Chromosome help a little, but I can't think of a way to
avoid full table scans for each of the position range queries.
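
To make the index tinkering concrete, one attempt of this kind might look like the sketch below. It is not a claimed fix. The table and column names are hypothetical lowercase versions of the ones above, "=" stands in for "like" on the assumption that an exact match is intended, and whether the planner actually uses the index depends on the data distribution.

```sql
-- Hypothetical names: simple_table and its snake_case columns are assumed,
-- as is the use of "=" where the original query uses "like".

-- One composite b-tree index: chromosome first, then start position.
CREATE INDEX simple_table_chrom_start_idx
    ON simple_table (feature_chromosome_name, start_position);

-- Refresh planner statistics after building the index.
ANALYZE simple_table;

-- Show the chosen plan and actual timings; a "Seq Scan" node here would
-- confirm the full table scan described above.
EXPLAIN ANALYZE
SELECT feature_id
FROM simple_table
WHERE feature_chromosome_name = 'chrX'
  AND start_position > 1000500
  AND end_position < 2000000;
```

Posting the EXPLAIN ANALYZE output along with the question usually makes it easier for others on the list to see where the time goes.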

Any advice on how I might be able to improve this situation would be 
very helpful.

Thanks!
John
