Pgstat.tmp file activity

From: "Dan Austin" <DAustin(at)au(dot)westfield(dot)com>
To: <pgsql-admin(at)postgresql(dot)org>
Subject: Pgstat.tmp file activity
Date: 2005-12-22 07:24:31
Message-ID: ED242F3A9037824A952B71B480A5D7ED6DFAA4@ausyd-exmb01.au.ad.westfield.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Hi,

Have a major reliability problem with out postgres installation here.
Basically the backends seem to "hang" when the server is under
non-trivial load. It can cope with ~200 concurrent connections (max is
1024) but anything over that and all the backends start to hang. By hang
I mean they answer connection attempts and accept a SQL statement but
take a LONG time to reply.

During this time if I do a "df" on the server the df command will block
as well when it comes to list the partition that postgres's database is
stored on. The box does not appear to be CPU bound and there is little
or no ioswap activity. There is however a LOT of disk writing activity
on the postgres partition.

Vitals:
* Solaris 10 (SPARC)
* Postgres 8.0.2

Solaris 10 has a tool called dtrace which reveals that pid #687 (a
postgres backend) is doing a LOT of writing to a single file (iosnoop.d
for you dtrace hackers):

UID PID D BLOCK SIZE COMM PATHNAME
103 687 W 120384 98304 postgres
/export/data/pgsql/data/global/pgstat.tmp.687
103 687 W 120384 98304 postgres
/export/data/pgsql/data/global/pgstat.tmp.687
103 687 W 120384 98304 postgres
/export/data/pgsql/data/global/pgstat.tmp.687
103 687 W 4521728 65536 postgres
/export/data/pgsql/data/global/pgstat.tmp.687
103 687 W 4521728 65536 postgres
/export/data/pgsql/data/global/pgstat.tmp.687
103 687 W 4521728 65536 postgres
/export/data/pgsql/data/global/pgstat.tmp.687
103 687 W 4521904 131072 postgres
/export/data/pgsql/data/global/pgstat.tmp.687
103 687 W 4522160 81920 postgres
/export/data/pgsql/data/global/pgstat.tmp.687
103 687 W 4521904 131072 postgres
/export/data/pgsql/data/global/pgstat.tmp.687
103 687 W 4522160 81920 postgres
/export/data/pgsql/data/global/pgstat.tmp.687
103 687 W 4521904 212992 postgres
/export/data/pgsql/data/global/pgstat.tmp.687
[...]

# ps -fp 687
UID PID PPID C STIME TTY TIME CMD
pgsql 687 686 1 Dec 20 ? 57:09
/usr/local/pgsql/bin/postmaster -D /export/data/pgsql/data

(Hmmm... Interesting. The PID of the process doing all that writing is
always "+1" of its parent PID. I take it that the first child spawned by
postmaster is responsible for updating the pg_stats_activity table and
others?)

This database serves a website and most of its queries (almost all) will
be read-only. So it's not clear why so much disk writing is going on. So
we started to look at exactly what was doing the writing and that's when
we found this file. We have postgres running on two other boxes, both
Solaris, same version of Postgres, and they show similar activity.

We think that maybe Postgres is saturating the disk and this is causing
the "blocking". Mind you the iostat.d program that produced the output
above was running at a time when there was low load and things were
fine.

So my questions are:

1. What is the pgstat.tmp file for and why does a single PID on each
server write almost continuously to it?

2. Why does that file not really exist in the file system according to
"ls"?

3. Where should I look to determine why Postgres is writing so much to
that disk?

Dan

--
Daniel Austin (+61 2 9358 7983)
Internet Technology Manager

--
Confidential: This email is intended only for the person or entity to
which it is addressed. The information contained in this email may be
confidential and privileged. Any retransmission, review, dissemination or
other use of, or taking of any action in reliance upon, this information
by persons or entities other than the intended recipient is prohibited.
If you received this email in error, please let us know by reply and
destroy all copies of it. Thank you.

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Richard Huxton 2005-12-22 07:27:23 Re: [GENERAL] PostgreSQL crashing
Previous Message Michael Fuhr 2005-12-22 06:31:10 Re: How to get OID return value(pls help me)