Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL

From: Tory M Blue <tmblue(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL
Date: 2010-01-21 22:15:29
Message-ID: 8a547c841001211415i12b39fcbx1179b340046d4ee@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

The issues we are seeing besides just saying the reports take over 26 hours,
is that the query seems to be CPU bound. Meaning that the query consumes an
entire CPU and quite often it is sitting with 65%-90% WAIT. Now this is not
iowait, the disks are fine, 5000-6000tps, 700K reads etc with maybe 10-13%
iowait.

However much of the time that I see the CPU at 65%-90% Wait, there is very
little disk access, so it's not the disk subsystem (in my opinion). I've
also moved CPU's around and the sql seems to stall regardless of what CPU
the job has been provided with. Memory I pulled memory to test and again,
other than this data set consuming 10gigs of data, 700K free (will add more
memory), but since the disks are not a bottleneck and I don't appear to be
swapping, I keep coming back to the data or sql.

I'm providing the data that I think is requested when a performance issue is
observed.

There is an autovac running, the queries are run on static data so
INSERTS/UPDATES/DELETES

The query seems to have gotten slower as the data set grew.

Redhat
Postgres 8.3.4
8 cpu box
10gig of ram

Number of rows in the table= 100million

· Size of table including indices =21GB

· Time to create a combined index on 2 columns (tagged boolean ,
makeid text) = more than 1 hr 30 minutes

· Data distribution = In the 98mill records, there are 7000 unique
makeid's, and 21mill unique UID's. About 41mill of the records have
tagged=true

· Time to execute the following query with indices on makeid and
tagged = 90-120 seconds. The planner uses the webid index and filters on
tagged and then rechecks the webid index

* SELECT COUNT(DISTINCT uid ) AS active_users FROM
pixelpool.userstats WHERE makeid ='bmw-ferman' AND tagged =true*

· Time to execute the the same query with a combined index on makeid
and tagged = 60-100 seconds. The planner uses the combined index and then
filters tagged.

*SELECT COUNT(DISTINCT uid ) AS active_users FROM pixelpool.userstats
WHERE makeid ='bmw-ferman' AND tagged =true*

* Plan*

* "Aggregate (cost=49467.00..49467.01 rows=1 width=8)"*

* " -> Bitmap Heap Scan on userstats
(cost=363.49..49434.06 rows=13175 width=8)"*

* " Recheck Cond: (makeid = 'b1mw-ferman'::text)"*

* " Filter: tagged"*

* " -> Bitmap Index Scan on
idx_retargetuserstats_makeidtag (cost=0.00..360.20 rows=13175 width=0)"*

* " Index Cond: ((makeid = 'b1mw-ferman'::text)
AND (tagged = true))"*

Any assistance would be appreciated, don't worry about slapping me around I
need to figure this out. Otherwise I'm buying new hardware where it may not
be required.

Thanks

Tory

*
*

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message PG User 2010 2010-01-21 22:43:32 Re: performance question on VACUUM FULL (Postgres 8.4.2)
Previous Message Devrim GÜNDÜZ 2010-01-21 19:45:09 Re: New server to improve performance on our large and busy DB - advice?