
Re: Slow query with a lot of data

From: Moritz Onken <onken(at)houseofdesign(dot)de>
To:
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow query with a lot of data
Date: 2008-08-21 07:45:32
Message-ID: DDB96E10-947A-4AA8-90D4-770567E1F236@houseofdesign.de
Lists: pgsql-performance
On 21.08.2008 at 09:04, Moritz Onken wrote:

>
> On 20.08.2008 at 20:28, Tom Lane wrote:
>
>> "Scott Carey" <scott(at)richrelevance(dot)com> writes:
>>> The planner actually thinks there will only be 28704 rows returned  
>>> of width
>>> 12.  But it chooses to sort 53 million rows before aggregating.   
>>> Thats
>>> either a bug or there's something else wrong here.   That is the  
>>> wrong way
>>> to aggregate those results no matter how much work_mem you have  
>>> unless I'm
>>> completely missing something...
>>
>> That does look weird.  What are the datatypes of the columns being
>> grouped by?  Maybe they're not hashable?
>>
>> Another forcing function that prevents use of HashAgg is DISTINCT
>> aggregates, but you don't seem to have any in this query...
>>
>> 			regards, tom lane
>
> The datatypes are both integers. There is no DISTINCT in this query.
> Thanks anyway!
>

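Just to rule out the hashability question directly: integer does have a
default hash operator class, so HashAgg should be available for these
columns. A quick catalog query (a sketch, not something from the original
run) confirms it:

-- Does "integer" have a default hash operator class,
-- i.e. is it usable by HashAggregate?
select opc.opcname, opc.opcdefault
   from pg_opclass opc
   join pg_am am on am.oid = opc.opcmethod
   where opc.opcintype = 'integer'::regtype
   and am.amname = 'hash';

The query in question: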
insert into setup1
select
   a."user", b.category, sum(1.0 / b.cat_count)::float
from result a
join domain_categories b on a."domain" = b."domain"
where b.depth < 4
  and a.results > 100
group by a."user", b.category;
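
To see which strategy the planner actually picks for the aggregate,
running EXPLAIN on the SELECT part alone shows whether it chooses
HashAggregate or Sort + GroupAggregate (a sketch; the plan output
itself will vary with the data):

explain
select
   a."user", b.category, sum(1.0 / b.cat_count)::float
from result a
join domain_categories b on a."domain" = b."domain"
where b.depth < 4
  and a.results > 100
group by a."user", b.category;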

This query inserted a total of 16,000,000 rows and, with work_mem set
to 3000MB, took about 24 hours.
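
For reference, a per-session setting like this is roughly how work_mem
was raised for the run (sketch):

set work_mem = '3000MB';
show work_mem;  -- verify the session value took effect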

Any more ideas to speed this up?



