Re: Help speeding up a left join aggregate

From: Alban Hertroys <haramrae(at)gmail(dot)com>
To: Nick <nboutelier(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Help speeding up a left join aggregate
Date: 2012-01-31 22:51:33
Message-ID: 8CA1BA7B-515B-4821-A409-362ADA3C72AD@gmail.com
Lists: pgsql-general


On 31 Jan 2012, at 4:55, Nick wrote:

> I have a pretty well tuned setup, with appropriate indexes and 16GB of
> available RAM. Should this be taking this long? I forced it to not use
> a sequential scan and that only knocked a second off the plan.
>
>                                  QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------
>  Hash Right Join  (cost=105882.35..105882.47 rows=3 width=118) (actual time=3931.567..3931.583 rows=4 loops=1)
>    Hash Cond: (songs_downloaded.advertisement_id = a.id)
>    ->  HashAggregate  (cost=105881.21..105881.26 rows=4 width=13) (actual time=3931.484..3931.489 rows=3 loops=1)
>          ->  Seq Scan on songs_downloaded  (cost=0.00..95455.96 rows=1042525 width=13) (actual time=0.071..1833.680 rows=1034752 loops=1)
>                Filter: (advertiser_id = 6553406)
>    ->  Hash  (cost=1.10..1.10 rows=3 width=46) (actual time=0.050..0.050 rows=4 loops=1)
>          Buckets: 1024  Batches: 1  Memory Usage: 1kB
>          ->  Seq Scan on advertisements a  (cost=0.00..1.10 rows=3 width=46) (actual time=0.037..0.041 rows=4 loops=1)
>                Filter: (advertiser_id = 6553406)
>  Total runtime: 3931.808 ms
> (10 rows)

I bet the GROUP BY query would be far more selective on advertisement_id than on the advertiser_id it's currently filtering on, wouldn't it? The plan shows the advertiser_id filter keeping 1034752 rows from songs_downloaded, which are then aggregated down to just 3 groups.
Perhaps the query planner chooses the wrong filter here because the advertiser_id condition is in the inner query, while the advertisement_id condition is outside it. You could try and see what happens if you move the advertiser_id into the join condition:

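SELECT a.id, sd.price, COALESCE(sd.downloads, 0) AS downloads,
       COALESCE(sd.download_revenue, 0) AS download_revenue
FROM advertisements a
-- advertiser_id must be selected and grouped in the subquery,
-- otherwise the join condition cannot reference sd.advertiser_id
LEFT JOIN (SELECT advertisement_id, advertiser_id, AVG(price) AS price,
                  SUM(price) AS download_revenue, COUNT(1) AS downloads
           FROM songs_downloaded
           GROUP BY advertisement_id, advertiser_id) AS sd
       ON a.id = sd.advertisement_id AND a.advertiser_id = sd.advertiser_id
-- qualified with the table alias so the column reference is unambiguous
WHERE a.advertiser_id = 6553406;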
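
To judge whether moving the condition helps, the two forms can be compared under EXPLAIN ANALYZE. The original query is not included in this message, so the sketch below is only a guess at the filter-inside form, reconstructed from the plan quoted above (aggregate over songs_downloaded filtered on advertiser_id, then joined to advertisements). Treat the exact column list as an assumption.

```sql
-- Assumed reconstruction of the filter-inside form, inferred from the quoted plan.
-- EXPLAIN ANALYZE runs the query and reports actual timings per plan node.
EXPLAIN ANALYZE
SELECT a.id, sd.price, COALESCE(sd.downloads, 0) AS downloads,
       COALESCE(sd.download_revenue, 0) AS download_revenue
FROM advertisements a
LEFT JOIN (SELECT advertisement_id, AVG(price) AS price,
                  SUM(price) AS download_revenue, COUNT(1) AS downloads
           FROM songs_downloaded
           WHERE advertiser_id = 6553406   -- filter applied before aggregating
           GROUP BY advertisement_id) AS sd
       ON a.id = sd.advertisement_id
WHERE a.advertiser_id = 6553406;
```

Running the join-condition variant the same way and comparing the row counts on the songs_downloaded scan should show whether the planner picks a different access path.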

Alban Hertroys

--
Screwing up is an excellent way to attach something to the ceiling.
