Really really slow select count(*)

From: felix <crucialfelix(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Really really slow select count(*)
Date: 2011-02-04 14:46:35
Message-ID: AANLkTinAJcm5A+i1O_R87aiYyu8N2OWgoaL9RfR=tUR9@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I am having huge performance problems with a table. Performance deteriorates
every day and I have to run REINDEX and ANALYZE on it every day. auto
vacuum is on. yes, I am reading the other thread about count(*) :)

but obviously I'm doing something wrong here

explain analyze select count(*) from fastadder_fastadderstatus;

Aggregate (cost=62458.73..62458.74 rows=1 width=0) (actual
time=77130.000..77130.000 rows=1 loops=1)
-> Seq Scan on fastadder_fastadderstatus (cost=0.00..61701.18
rows=303018 width=0) (actual time=50.000..76930.000 rows=302479 loops=1)
Total runtime: *77250.000 ms*

directly after REINDEX and ANALYZE:

Aggregate (cost=62348.70..62348.71 rows=1 width=0) (actual
time=15830.000..15830.000 rows=1 loops=1)
-> Seq Scan on fastadder_fastadderstatus (cost=0.00..61613.16
rows=294216 width=0) (actual time=30.000..15570.000 rows=302479 loops=1)
Total runtime: 15830.000 ms

still very bad for a 300k row table

a similar table:

explain analyze select count(*) from fastadder_fastadderstatuslog;

Aggregate (cost=8332.53..8332.54 rows=1 width=0) (actual
time=1270.000..1270.000 rows=1 loops=1)
-> Seq Scan on fastadder_fastadderstatuslog (cost=0.00..7389.02
rows=377402 width=0) (actual time=0.000..910.000 rows=377033 loops=1)
Total runtime: 1270.000 ms

It gets updated quite a bit each day, and this is perhaps the problem.
To me it doesn't seem like that many updates

100-500 rows inserted per day
no deletes

10k-50k updates per day
mostly of this sort: set priority=1 where id=12345

is it perhaps this that is causing the performance problem ?

I could rework the app to be more efficient and do updates using batches
where id IN (1,2,3,4...)

I assume that means a more efficient index update compared to individual
updates.

There is one routine that updates position_in_queue using a lot (too many)
update statements.
Is that likely to be the culprit ?

*What else can I do to investigate ?*

Table
"public.fastadder_fastadderstatus"
Column | Type |
Modifiers
-------------------+--------------------------+------------------------------------------------------------------------
id | integer | not null default
nextval('fastadder_fastadderstatus_id_seq'::regclass)
apt_id | integer | not null
service_id | integer | not null
agent_priority | integer | not null
priority | integer | not null
last_validated | timestamp with time zone |
last_sent | timestamp with time zone |
last_checked | timestamp with time zone |
last_modified | timestamp with time zone | not null
running_status | integer |
validation_status | integer |
position_in_queue | integer |
sent | boolean | not null default false
built | boolean | not null default false
webid_suffix | integer |
build_cache | text |
Indexes:
"fastadder_fastadderstatus_pkey" PRIMARY KEY, btree (id)
"fastadder_fastadderstatus_apt_id_key" UNIQUE, btree (apt_id,
service_id)
"fastadder_fastadderstatus_agent_priority" btree (agent_priority)
"fastadder_fastadderstatus_apt_id" btree (apt_id)
"fastadder_fastadderstatus_built" btree (built)
"fastadder_fastadderstatus_last_checked" btree (last_checked)
"fastadder_fastadderstatus_last_validated" btree (last_validated)
"fastadder_fastadderstatus_position_in_queue" btree (position_in_queue)
"fastadder_fastadderstatus_priority" btree (priority)
"fastadder_fastadderstatus_running_status" btree (running_status)
"fastadder_fastadderstatus_service_id" btree (service_id)
Foreign-key constraints:
"fastadder_fastadderstatus_apt_id_fkey" FOREIGN KEY (apt_id) REFERENCES
nsproperties_apt(id) DEFERRABLE INITIALLY DEFERRED
"fastadder_fastadderstatus_service_id_fkey" FOREIGN KEY (service_id)
REFERENCES fastadder_fastadderservice(id) DEFERRABLE INITIALLY DEFERRED

thanks !

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message hubert depesz lubaczewski 2011-02-04 14:49:51 Re: Really really slow select count(*)
Previous Message Greg Smith 2011-02-04 14:44:58 Re: Query performance with disabled hashjoin and mergejoin