| From: | "Jeffrey W(dot) Baker" <jwbaker(at)acm(dot)org> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | pgsql-performance(at)postgresql(dot)org | 
| Subject: | Re: Huge Data sets, simple queries | 
| Date: | 2006-01-28 17:08:53 | 
| Message-ID: | 1138468133.9336.5.camel@noodles | 
| Lists: | pgsql-performance | 
On Sat, 2006-01-28 at 10:55 -0500, Tom Lane wrote:
> 
> Assuming that "month" means what it sounds like, the above would
> result in running twelve parallel sort/uniq operations, one for each
> month grouping, to eliminate duplicates before counting.  You've got
> sortmem set high enough to blow out RAM in that scenario ...
Hrmm, why is it that with a similar query I get a far simpler plan than
you describe, and relatively snappy runtime?
  select date
       , count(1) as nads
       , sum(case when premium then 1 else 0 end) as npremium
       , count(distinct(keyword)) as nwords
       , count(distinct(advertiser)) as nadvertisers 
    from data 
group by date 
order by date asc
                                          QUERY PLAN                                           
-----------------------------------------------------------------------------------------------
 GroupAggregate  (cost=0.00..14452743.09 rows=721 width=13)
   ->  Index Scan using data_date_idx on data  (cost=0.00..9075144.27 rows=430206752 width=13)
(2 rows)
=# show server_version;
 server_version 
----------------
 8.1.2
(1 row)
-jwb
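
For comparison, a minimal sketch of the month-grouped query shape Tom describes, the kind that spawns a sort/uniq per group for the DISTINCT aggregates. The date_trunc('month', ...) grouping and the work_mem value are illustrative assumptions, not taken from the thread; work_mem is the 8.1 name for the setting Tom calls sortmem:

  -- illustrative per-sort memory cap, in kB; each group's DISTINCT
  -- aggregates get their own sort, and each sort may use up to work_mem,
  -- so memory use multiplies with the number of groups
  set work_mem = '65536';

    select date_trunc('month', date) as month
         , count(distinct keyword)    as nwords
         , count(distinct advertiser) as nadvertisers
      from data
  group by date_trunc('month', date)
  order by month;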