| From: | Jeff Boes <jboes(at)nexcerpt(dot)com> | 
|---|---|
| To: | pgsql-sql(at)postgresql(dot)org | 
| Subject: | SQL challenge--top 10 for each key value? | 
| Date: | 2004-04-08 19:55:33 | 
| Message-ID: | 33af4fa4e4ec230574cefcff737ea40a@news.teranews.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-sql | 
Offered up for anyone with time on their hands. I fiddled around with 
this for half an afternoon, then gave up and did it programmatically in 
Perl.
Given a table that looks something like this:
id       | INTEGER
query    | INTEGER
checksum | char(32)
score    | INTEGER
include  | BOOLEAN
The table is unique by "id". "Checksum" may be repeated, but I only care 
if it is repeated within a given group by "query". ("query" is non-null.)
I can get the top scorer for each "query" row by something like this:
SELECT * FROM (
   SELECT DISTINCT ON (checksum) *
   FROM my_table
   ORDER BY checksum, score DESC)
ORDER BY query;
How would you go about getting the top N (say, the top 10) for each query?
And then, if that's too easy for you--consider a further case where I 
want every row for a given "query" that has "include" TRUE, and enough 
non-"include" rows to make N. I might end up with more than N rows for a 
given value of "query" if there were more than N with "include" set.
I headed off in the direction of groups of SELECTs and UNIONs, and quit 
when I got to something like four levels of "SELECT ... AS FOO" ...
-- 
Jeff Boes                                      vox 269.226.9550 ext 24
Database Engineer                                     fax 269.349.9076
Nexcerpt, Inc.                                 http://www.nexcerpt.com
            ...Nexcerpt... Extend your Expertise
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Greg Stark | 2004-04-08 23:33:01 | Re: SQL challenge--top 10 for each key value? | 
| Previous Message | Richard Huxton | 2004-04-08 11:00:16 | Re: Encoding and result string length |