Re: Large pgstat.stat file causes I/O storm

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Cristian Gafton <gafton(at)rpath(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Large pgstat.stat file causes I/O storm
Date: 2008-01-29 20:12:41
Message-ID: 25005.1201637561@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Cristian Gafton <gafton(at)rpath(dot)com> writes:
> We are churning through a bunch of short-lived temp tables.

I think that's probably the root of the problem ...

> Since I
> reported the problem, the pgstat file is now sitting at 85M, yet the
> pg_stat* tables barely have any entries in them:

> count(*)
> pg_stats 298
> pg_statistic 298
> pg_stat_all_indexes 76
> pg_stat_all_tables 76
> pg_statio_all_tables 56
> pg_statio_all_indexes 76

Those views are joins against pg_class, so only tables that have live
pg_class rows can possibly show up there. You could try remembering the
OIDs of some temp tables and probing the underlying pg_stat_get_xxx()
functions to see if there are stats-table entries for them.

(Pokes around in the code...) I think the problem here is that the only
active mechanism for flushing dead stats-table entries is
pgstat_vacuum_tabstat(), which is invoked by a VACUUM command or an
autovacuum. Once-a-day VACUUM isn't gonna cut it for you under those
circumstances. What you might do is just issue a VACUUM on some
otherwise-uninteresting small table, once an hour or however often you
need to keep the stats file bloat to a reasonable level.

There is a pgstat_drop_relation() function to tell the stats collector
to drop a single table entry, but it's not being called from anyplace.
We probably ought to try a bit harder to make that work. The problem
is described here:

2007-07-08 18:23 tgl

* src/: backend/postmaster/pgstat.c, backend/storage/smgr/smgr.c,
include/pgstat.h (REL8_1_STABLE), backend/postmaster/pgstat.c,
backend/storage/smgr/smgr.c, include/pgstat.h (REL8_2_STABLE),
backend/postmaster/pgstat.c, backend/storage/smgr/smgr.c,
include/pgstat.h: Remove the pgstat_drop_relation() call from
smgr_internal_unlink(), because we don't know at that point which
relation OID to tell pgstat to forget. The code was passing the
relfilenode, which is incorrect, and could possibly cause some
other relation's stats to be zeroed out. While we could try to
clean this up, it seems much simpler and more reliable to let the
next invocation of pgstat_vacuum_tabstat() fix things; which indeed
is how it worked before I introduced the buggy code into 8.1.3 and
later :-(. Problem noticed by Itagaki Takahiro, fix is per
subsequent discussion.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2008-01-29 20:35:36 Re: [PATCHES] Proposed patch: synchronized_scanning GUCvariable
Previous Message Cristian Gafton 2008-01-29 19:45:38 Re: Large pgstat.stat file causes I/O storm