Re: more problems with count(*) on large table

From: Tomasz Ostrowski <tometzky(at)batory(dot)org(dot)pl>
To: Mike Charnoky <noky(at)nextbus(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: more problems with count(*) on large table
Date: 2007-10-02 07:42:36
Message-ID: 20071002074236.GB14546@batory.org.pl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, 01 Oct 2007, Mike Charnoky wrote:

> I altered the table in question, with "set statistics 100" on the
> timestamp column, then ran analyze. This seemed to help somewhat. Now,
> queries don't seem to hang, but it still takes a long time to do the count:
> * "where evtime between '2007-09-26' and '2007-09-27'"
> took 37 minutes to run (result was ~12 million)
> * "where evtime between '2007-09-25' and '2007-09-26'"
> took 40 minutes to run (result was ~14 million)

Maybe it needs several million scattered seeks which basically
disable disk cache.

If you can afford a database inactivity period you can for example do
periodically (for example on nights or weekends):
cluster pred_acc_evtime_index on prediction_accuracy;
(see http://www.postgresql.org/docs/8.2/static/sql-cluster.html)
This would run rather long time for the first time, but will be much
faster later.

This should make seeks much more local and it would help operating
system to cache results.

Regards
Tometzky
--
...although Eating Honey was a very good thing to do, there was a
moment just before you began to eat it which was better than when you
were...
Winnie the Pooh

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Martijn van Oosterhout 2007-10-02 07:47:19 Re: more problems with count(*) on large table
Previous Message Martijn van Oosterhout 2007-10-02 07:30:25 Re: how to ignore invalid byte sequence for encoding without using sql_ascii?