Re: Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Tory M Blue <tmblue(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL
Date: 2010-01-23 02:25:17
Message-ID: 4B5A5E0D.4000003@postnewspapers.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 23/01/2010 1:59 AM, Tory M Blue wrote:

> It's this query, run 6000 times with a diff makeid's /
> /
>
> /SELECT COUNT(DISTINCT uid ) AS active_users FROM
> pixelpool.userstats WHERE makeid ='bmw-ferman' AND tagged =true/
>
> / Plan/
>
> / "Aggregate (cost=49467.00..49467.01 rows=1 width=8)"/
>
> / " -> Bitmap Heap Scan on userstats (cost=363.49..49434.06
> rows=13175 width=8)"/
>
> / " Recheck Cond: (makeid = 'b1mw-ferman'::text)"/
>
> / " Filter: tagged"/
>
> / " -> Bitmap Index Scan on idx_retargetuserstats_makeidtag
> (cost=0.00..360.20 rows=13175 width=0)"/
>
> / " Index Cond: ((makeid = 'b1mw-ferman'::text) AND (tagged
> = true))"/

Try:

- Adding a partial index on makeid, eg:

CREATE INDEX userstats_makeid_where_tagged_idx
ON userstats (makeid) WHERE (tagged);

- Instead of repeating the query 6000 times in a loop, collect the data
in one pass by joining against a temp table containing the makeids of
interest.

SELECT COUNT(DISTINCT u.uid) AS active_users
FROM pixelpool.userstats u
INNER JOIN temp_makeids m ON (u.makeid = m.makeid)
WHERE u.tagged = true;

(If the 6000 repeats are really a correlated subquery part of a bigger
query you still haven't shown, then you might be able to avoid 6000
individual passes by adjusting your outer query instead).

--
Craig Ringer

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Scott Carey 2010-01-23 02:51:24 Re: Inserting 8MB bytea: just 25% of disk perf used?
Previous Message fkater@googlemail.com 2010-01-22 20:42:03 Re: Inserting 8MB bytea: just 25% of disk perf used?