From: | David Garamond <lists(at)zara(dot)6(dot)isreserved(dot)com> |
---|---|
To: | pgsql-sql(at)postgresql(dot)org |
Subject: | determining how many products are in how many categories |
Date: | 2004-02-08 17:09:53 |
Message-ID: | 40266D61.3070304@zara.6.isreserved.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
# product table (simplified):
create table p (
id char(22) not null primary key,
name text,
desc text
);
# product category table (simpl.):
create table pc (
id char(22) not null primary key,
name text,
desc text
);
# table that maps products into categories:
create table p_pc (
id char(22) not null primary key,
pid char(22) not null references p(id),
pcid char(22) not null references pc(id)
);
create index i_ppc_pid on p_pc(pid);
create index i_ppc_pcid on p_pc(pcid);
create unique index i_ppc_pid_pcid on p_pc(pid, pcid);
There are about 50k rows in p, 40k in pc, and 140k in p_pc. Most
products are only placed in 1 category, some in 2, fewer in 3, and fewer
stills in 4, 5, 6 categories.
I want to know how many products are placed in 1 category, how many in
2, and so on. Here's my query:
select count(pid),num_cat from (
select pid,count(cid) as num_cat
from ppc group by pid) as f
group by num_cat;
A sample output (4000 products are categorized in 5 different places,
4998 in 4, and so on):
count | num_cat
-------+---------
4000 | 5
4998 | 4
7502 | 3
10001 | 2
17499 | 1
(5 rows)
However, this query is taking about 2.5 minutes. Any idea on how to make
it faster?
--
dave
From | Date | Subject | |
---|---|---|---|
Next Message | Philippe Lang | 2004-02-09 07:46:40 | Disable/Enable Trigger? |
Previous Message | Tom Lane | 2004-02-07 20:53:55 | Re: Storing a range of numbers |