Re: count(*) performance

From: "Mikael Carneholm" <Mikael(dot)Carneholm(at)WirelessCar(dot)com>
To: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
Cc: "Postgresql Performance" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: count(*) performance
Date: 2006-03-27 23:17:43
Message-ID: 7F10D26ECFA1FB458B89C5B4B0D72C2B088296@sesrv12.wirelesscar.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I think it is definitely necessary from an administration point of view - as an administrator, I want to know:

1) Are there any stats (at all) in a schema
2) Are there any stats on the table that slow_query_foo is targeting
3) If I have stats, how recent are they
4) Could it be that there are a lot of dead tuples lying around (given the amount of traffic I know I have)

These would be (are always!) the first questions I ask myself when I'm about to identify performance problems in an app, don't know how other people do though :)

Maybe something I'll try to look into this weekend, if I can spare some time.

- Mikael

-----Original Message-----
From: Matthew T. O'Connor [mailto:matthew(at)zeut(dot)net]
Sent: den 28 mars 2006 00:43
To: Mikael Carneholm
Cc: Postgresql Performance
Subject: Re: [PERFORM] count(*) performance

Mikael Carneholm wrote:
> This is where a "last_vacuumed" (and "last_analyzed") column in
> pg_statistic(?) would come in handy. Each time vacuum or analyze has
> finished, update the row for the specific table that was
> vacuumed/analyzed with a timestamp in the last_vacuumed/last_analyzed
> column. No more guessing "maybe I haven't vacuumed/analyzed in a while",
> and each time a user complains about bad performance, one could request
> the user to do a "select s.last_vacuumed, s.last_analyzed from
> pg_statistic s, pg_attribute a, pg_class c where ..."
>
> It SOUNDS easy to implement, but that has fooled me before... :-)

It is fairly easy to implement, however it has been discussed before and
decided that it wasn't necessary. What the system cares about is how
long it's been since the last vacuum in terms of XIDs not time. Storing
a timestamp would make it more human readable, but I'm not sure the
powers that be want to add two new columns to some system table to
accommodate this.

Matt

Browse pgsql-performance by date

  From Date Subject
Next Message Simon Riggs 2006-03-28 08:30:54 Re: simple join uses indexes, very slow
Previous Message Tom Lane 2006-03-27 23:13:48 Re: count(*) performance