| From: | "Mischa Sandberg" <mischa_sandberg(at)telus(dot)net> | 
|---|---|
| To: | pgsql-performance(at)postgresql(dot)org | 
| Subject: | Range query optimization | 
| Date: | 2004-06-24 21:24:58 | 
| Message-ID: | KEHCc.11583$HS3.659@edtnps84 | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-performance | 
I'm trying to make a (qua-technical, qua-business) case for switching from
MS SQL, and one of the types of query that really doesn't sit well with MS
SQL2K is:
-- All fields integers or equivalent.
-- Table T(k, x: nonkey fields...)
-- Table U(k, a, z: m)    -- for each value of (k) a set of non-intersecting
ranges [a,z) that map to (m) values.
select T.*, U.m from T join U on T.k=U.k and T.x >= U.a and T.x < U.z
Typically there are are about 1000-2000 U rows per value of (k), about 100K
values of (k) and about 50M
values of T.
By itself, this type of query grinds the CPU to dust. A clustered index on
fields of U (take your pick) barely halves the problem of the loop through
1000-2000 rows of U for each row of T.  Hash join likewise.
The current workaround is a 'manual'  radix index on top of the range table,
but it's something of a hack.
Would the geometric of extensions handle such queries efficiently? I'm not
familiar with applying R-trees to linear range problems.
----
"Dreams come true, not free." -- S.Sondheim
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Chris Cheston | 2004-06-25 06:59:32 | postgres 7.4 at 100% | 
| Previous Message | Shea,Dan [CIS] | 2004-06-24 17:17:13 | Re: after using pg_resetxlog, db lost |