Quota query with decent performance?

From: Troels Arvin <troels(at)arvin(dot)dk>
To: pgsql-sql(at)postgresql(dot)org
Subject: Quota query with decent performance?
Date: 2003-11-11 22:58:09
Message-ID: pan.2003.11.11.22.58.08.758402@arvin.dk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Hello,

I'm researching how "quota queries" (a term used by Fabian Pascal) may be
performed in various DBMSes with acceptable performance:
http://troels.arvin.dk/db/rdbms/#select-limit-simple-note

An example of a quota query could be to get the top-3 youngest people from
a collection of people. The complicated part is that such a query might
return more than 3 rows in some tie situations.

In MSSQL and DB2 there are very efficient facilities for such queries, but
I can't find any well-performing declarative methods for PostgreSQL. I
have tried a couple of different strategies, and I currently get the best
results from a correlated subquery like

SELECT * FROM person AS px
WHERE (
SELECT COUNT(*)
FROM person AS py
WHERE py.age < px.age
) < 3;

When my base table has 4000 rows, my query takes 27 seconds in PostgreSQL
7.2.3 (PIII 1000MHz) which is clearly unacceptable, especially comparing
to the same query in DB2 which only takes 1.4 seconds (on the same server)
- or to this non-standard-SQL DB2-query which only takes 0.02 seconds to
calculate the same result:

SELECT *
FROM (
SELECT firstname,age,RANK() OVER (ORDER BY age ASC) AS rank
FROM person
) AS foo
WHERE rank<=3;

Test-files with table definitions and randomly generated rows:
http://troels.arvin.dk/db/tests/quota.1/

Any suggestions on how to perform fast "quota queries" in PostgreSQL?

--
Greetings from Troels Arvin, Copenhagen, Denmark

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Josh Berkus 2003-11-11 23:41:34 Re: Quota query with decent performance?
Previous Message Nick Fankhauser 2003-11-11 21:23:02 Re: Is there a more elegant way to write this query?...