| From: | "Jeffrey W(dot) Baker" <jwbaker(at)acm(dot)org> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | pgsql-performance(at)postgresql(dot)org | 
| Subject: | Re: Huge Data sets, simple queries | 
| Date: | 2006-01-28 17:08:53 | 
| Message-ID: | 1138468133.9336.5.camel@noodles | 
| Lists: | pgsql-performance | 
On Sat, 2006-01-28 at 10:55 -0500, Tom Lane wrote:
> 
> Assuming that "month" means what it sounds like, the above would
> result in running twelve parallel sort/uniq operations, one for each
> month grouping, to eliminate duplicates before counting.  You've got
> sortmem set high enough to blow out RAM in that scenario ...
Hrmm, why is it that with a similar query I get a far simpler plan than
you describe, and relatively snappy runtime?
  select date
       , count(1) as nads
       , sum(case when premium then 1 else 0 end) as npremium
       , count(distinct(keyword)) as nwords
       , count(distinct(advertiser)) as nadvertisers 
    from data 
group by date 
order by date asc
                                          QUERY PLAN                                           
-----------------------------------------------------------------------------------------------
 GroupAggregate  (cost=0.00..14452743.09 rows=721 width=13)
   ->  Index Scan using data_date_idx on data  (cost=0.00..9075144.27 rows=430206752 width=13)
(2 rows)
=# show server_version;
 server_version 
----------------
 8.1.2
(1 row)
-jwb
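
For comparison, a minimal sketch of the month-grouped query shape Tom describes, the kind that spawns a sort/uniq per group for the DISTINCT aggregates. The date_trunc('month', ...) grouping and the work_mem value are illustrative assumptions, not taken from the thread; work_mem is the 8.1 name for the setting Tom calls sortmem:

  -- illustrative per-sort memory cap, in kB; each group's DISTINCT
  -- aggregates get their own sort, and each sort may use up to work_mem,
  -- so memory use multiplies with the number of groups
  set work_mem = '65536';

    select date_trunc('month', date) as month
         , count(distinct keyword)    as nwords
         , count(distinct advertiser) as nadvertisers
      from data
  group by date_trunc('month', date)
  order by month;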