Re: more problems with count(*) on large table

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Alban Hertroys" <a(dot)hertroys(at)magproductions(dot)nl>
Cc: "Mike Charnoky" <noky(at)nextbus(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "A(dot) Kretschmer" <andreas(dot)kretschmer(at)schollglas(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: more problems with count(*) on large table
Date: 2007-10-01 10:04:29
Message-ID: 87ve9rjglu.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

"Alban Hertroys" <a(dot)hertroys(at)magproductions(dot)nl> writes:

> Mike Charnoky wrote:
>> With respect to the ALTER TABLE SET STATISTICS... how do I determine a
>> good value to use? This wasn't really clear in the pg docs. Also, do I
>> need to run ANALYZE on the table after I change the statistics?
>>
>> Here are the EXPLAINs from the queries:
>>
>> db=# explain select count(*) from prediction_accuracy where evtime
>> between '2007-09-25' and '2007-09-26';
>>
>> QUERY PLAN
>>
>> ---------------------------------------------------------------------------------------------------------------------------------------------------------
>> Aggregate (cost=475677.40..475677.41 rows=1 width=0)
>> -> Index Scan using pred_acc_evtime_index on prediction_accuracy
>> (cost=0.00..444451.44 rows=12490383 width=0)
>> Index Cond: ((evtime >= '2007-09-25 00:00:00-07'::timestamp
>> with time zone) AND (evtime <= '2007-09-26 00:00:00-07'::timestamp with
>> time zone))
>> (3 rows)
>>
>> db=# explain select count(*) from prediction_accuracy where evtime
>> between '2007-09-26' and '2007-09-27';
>>
>> QUERY PLAN
>>
>> ---------------------------------------------------------------------------------------------------------------------------------------------------------
>> Aggregate (cost=486615.04..486615.05 rows=1 width=0)
>> -> Index Scan using pred_acc_evtime_index on prediction_accuracy
>> (cost=0.00..454671.07 rows=12777586 width=0)
>> Index Cond: ((evtime >= '2007-09-26 00:00:00-07'::timestamp
>> with time zone) AND (evtime <= '2007-09-27 00:00:00-07'::timestamp with
>> time zone))
>> (3 rows)
>
> Interesting, same plans and no sequential scans... Yet totally different
> run times. Almost as if something prevents you to read some records
> between 26 and 27 september...

Just to be sure we're looking at the right plan do this:

\timing
PREPARE stmt AS
SELECT count(*)
FROM prediction_accuracy
WHERE evtime BETWEEN '2007-09-25' AND '2007-09-26';

EXPLAIN EXECUTE stmt;
EXECUTE stmt;

> I'm no expert on locking in Postgres, but AFAIK locks that prevent you
> from reading records are rather rare and probably only issued from
> userland code.

Pages can be locked but never for very long.

What other work is going on in this server? Is there anything which might be
locking the table periodically (either a VACUUM FULL, CLUSTER, ALTER TABLE,
etc?)

Perhaps there's a hardware problem, is there anything in your system logs from
dmesg?

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Albe Laurenz 2007-10-01 10:05:18 Re: more problems with count(*) on large table
Previous Message Anoo Sivadasan Pillai 2007-10-01 10:02:20 ERROR: invalid byte sequence from psql - Works perfectly from pgAdmin III query window