Skip site navigation (1) Skip section navigation (2)

Re: Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL

From: Richard Huxton <dev(at)archonet(dot)com>
To: Tory M Blue <tmblue(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL
Date: 2010-01-22 09:42:49
Message-ID: 4B597319.8010504@archonet.com (view raw or flat)
Thread:
Lists: pgsql-performance
On 21/01/10 22:15, Tory M Blue wrote:
> ·         Data distribution = In the 98mill records, there are 7000 unique
> makeid's, and 21mill unique UID's. About 41mill of the records have
> tagged=true
>
> ·         Time to execute the following query with indices on makeid and
> tagged = 90-120 seconds. The planner uses the webid index and filters on
> tagged and then rechecks the webid index
>
> *                SELECT COUNT(DISTINCT uid )  AS active_users FROM
> pixelpool.userstats    WHERE makeid ='bmw-ferman' AND tagged =true*
>
> ·         Time to execute the the same query with a combined index on makeid
> and tagged = 60-100 seconds. The planner uses the combined index and then
> filters tagged.

Two things:

1. You have got the combined index on (makeid, tagged) and not (tagged, 
makeid) haven't you? Just checking.

2. If it's mostly tagged=true you are interested in you can always use a 
partial index: CREATE INDEX ... (makeid) WHERE tagged
This might be a win even if you need a second index with WHERE NOT tagged.


Also, either I've not had enough cofee yet, or a bitmap scan is an odd 
choice for only ~ 13000 rows out of 100 million.

> *                "        ->   Bitmap Index Scan on
> idx_retargetuserstats_makeidtag  (cost=0.00..360.20 rows=13175 width=0)"*
>
> *                "              Index Cond: ((makeid = 'b1mw-ferman'::text)
> AND (tagged = true))"*

Otherwise, see what Craig said.

I'm assuming this isn't the query that is CPU bound for a long time. 
Unless your table is horribly bloated, there's no reason for that 
judging by this plan.

-- 
   Richard Huxton
   Archonet Ltd

In response to

Responses

pgsql-performance by date

Next:From: elias ghanemDate: 2010-01-22 12:42:39
Subject: Slow update query
Previous:From: DMDate: 2010-01-22 08:11:37
Subject: Fragmentation/Vacuum, Analyze, Re-Index

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group