Quick Links

Re: Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL

From:	Richard Huxton <dev(at)archonet(dot)com>
To:	Tory M Blue <tmblue(at)gmail(dot)com>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL
Date:	2010-01-22 09:42:49
Message-ID:	4B597319.8010504@archonet.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

On 21/01/10 22:15, Tory M Blue wrote:
> · Data distribution = In the 98mill records, there are 7000 unique
> makeid's, and 21mill unique UID's. About 41mill of the records have
> tagged=true
>
> · Time to execute the following query with indices on makeid and
> tagged = 90-120 seconds. The planner uses the webid index and filters on
> tagged and then rechecks the webid index
>
> * SELECT COUNT(DISTINCT uid ) AS active_users FROM
> pixelpool.userstats WHERE makeid ='bmw-ferman' AND tagged =true*
>
> · Time to execute the the same query with a combined index on makeid
> and tagged = 60-100 seconds. The planner uses the combined index and then
> filters tagged.

Two things:

1. You have got the combined index on (makeid, tagged) and not (tagged,
makeid) haven't you? Just checking.

2. If it's mostly tagged=true you are interested in you can always use a
partial index: CREATE INDEX ... (makeid) WHERE tagged
This might be a win even if you need a second index with WHERE NOT tagged.

Also, either I've not had enough cofee yet, or a bitmap scan is an odd
choice for only ~ 13000 rows out of 100 million.

> * " -> Bitmap Index Scan on
> idx_retargetuserstats_makeidtag (cost=0.00..360.20 rows=13175 width=0)"*
>
> * " Index Cond: ((makeid = 'b1mw-ferman'::text)
> AND (tagged = true))"*

Otherwise, see what Craig said.

I'm assuming this isn't the query that is CPU bound for a long time.
Unless your table is horribly bloated, there's no reason for that
judging by this plan.

--
Richard Huxton
Archonet Ltd

In response to

Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL at 2010-01-21 22:15:29 from Tory M Blue

Responses

Re: Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL at 2010-01-22 18:03:35 from Tory M Blue

Browse pgsql-performance by date

	From	Date	Subject
Next Message	elias ghanem	2010-01-22 12:42:39	Slow update query
Previous Message	DM	2010-01-22 08:11:37	Fragmentation/Vacuum, Analyze, Re-Index