Disconnects hanging server

From: Brian Wipf <brian(at)clickspace(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Disconnects hanging server
Date: 2007-12-03 21:16:55
Message-ID: 18B9152A-5535-4244-9F9E-B6A4BCB2D218@clickspace.com
Lists: pgsql-general

We have a dual 3.0 GHz dual-core Intel Xserve running Mac OS X 10.5.1
Leopard Server and PostgreSQL 8.2.5. When we disconnect many clients
at once (30+) in production, CPU usage goes through the roof and the
server hangs for many seconds, during which it is completely
unresponsive. The busier the server is, the longer the machine seems
to hang.

With an identical postgresql.conf file in the identical production
environment, our Linux 2.6.22 box running PG 8.2.5 has no problems
when disconnecting multiple clients. Also, our prior G5 Xserve running
Mac OS X Server 10.4.9 and PG 8.2.4 had no issues disconnecting
multiple clients.

Using pgbench, I have been able to duplicate the issue on another
Intel Xserve running 10.5.1 on a fresh install of PG 8.2.5. PG was
compiled 64-bit using CFLAGS='-arch x86_64'. The only configure option
was --enable-thread-safety.

The only modifications I have made to the postgresql.conf file are as
follows:
max_connections = 175
shared_buffers = 3GB       # the max supported under 10.5.1, after setting shmall and shmmax accordingly
checkpoint_segments = 64
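
For reference, the arithmetic behind the shmall/shmmax settings can be sketched as follows. This is a rough illustration, not the exact values used on this machine: it assumes that OS X takes kern.sysv.shmmax in bytes (as a multiple of 4096) and kern.sysv.shmall in 4096-byte pages, and the function name and the example input are hypothetical. The real PostgreSQL shared memory segment is somewhat larger than shared_buffers alone, so the input should be the full segment size.

```python
PAGE_SIZE = 4096  # assumed page size behind kern.sysv.shmall on OS X


def sysv_limits(segment_bytes):
    """Return (shmmax_bytes, shmall_pages) large enough for one segment.

    segment_bytes is the total shared memory segment size, i.e.
    shared_buffers plus PostgreSQL's other shared structures.
    """
    # Round up to whole pages so that shmmax is a multiple of PAGE_SIZE.
    pages = -(-segment_bytes // PAGE_SIZE)
    return pages * PAGE_SIZE, pages


if __name__ == "__main__":
    # Hypothetical input: exactly 3 GiB (shared_buffers with no overhead).
    shmmax, shmall = sysv_limits(3 * 1024**3)
    print(f"kern.sysv.shmmax={shmmax}")
    print(f"kern.sysv.shmall={shmall}")
```

The printed values are lower bounds for that input only; the segment PostgreSQL actually requests at startup is what the limits have to cover.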

I used a scale factor of 150 when initializing the pgbench database.
If I run `pgbench -c 150 -t 5000` and kill it (ctrl-c) shortly after
launch, but after it completes its vacuum, there is only a very minor
and brief increase in CPU usage (which I didn't notice at all on the
Linux box, by the way). If I let pgbench run for approximately 10
minutes and then ctrl-c it, the CPU maxes out and the machine hangs:
iostat stops reporting and top stops refreshing. This lasts for a
couple of seconds, then top and iostat resume. Here is what iostat
showed when I killed pgbench after approximately 10 minutes:

postgres$ iostat -n 5 1
...
           disk0           disk1           disk2           disk3         cpu      load average
  KB/t tps  MB/s  KB/t tps  MB/s  KB/t tps  MB/s  KB/t tps  MB/s  us  sy  id    1m    5m   15m
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00 10.20 732  7.30   2   4  93  1.07  2.22  2.30
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  8.96 766  6.71   1   2  98  1.07  2.22  2.30
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  9.33 755  6.88   1   2  97  1.07  2.22  2.30
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  8.86 777  6.73   0   2  97  1.07  2.22  2.30
--> I hit ctrl-c to kill pgbench here
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  9.17 766  6.86   1  43  55  1.07  2.22  2.30
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  9.03 770  6.79   0  79  20  1.71  2.33  2.34
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  9.04  77  0.68   1  38  61  1.71  2.33  2.34
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  8.95 273  2.39   0  80  19  1.71  2.33  2.34
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00 15.03 240  3.53   1  99   1  1.71  2.33  2.34
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  9.69 365  3.45   1  99   0  4.05  2.80  2.51
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00   0 100   0  4.05  2.80  2.51
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00   0 100   0  4.05  2.80  2.51
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  8.50  16  0.13   0 100   0  8.85  3.82  2.87
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00 10.18  17  0.17   0 100   0  8.85  3.82  2.87
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  9.00  75  0.66   0 100   0  8.85  3.82  2.87
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  8.50  16  0.13   0 100   0 12.39  4.64  3.16
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  9.10  68  0.60   0 100   0 14.20  5.14  3.35
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  8.75  75  0.64   0 100   0 14.20  5.14  3.35
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  8.83 249  2.14   0 100   0 14.20  5.14  3.35
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  8.57  14  0.12   0 100   0 15.46  5.55  3.50
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  9.33 265  2.41   1  99   0 15.46  5.55  3.50
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  9.15 361  3.22   0 100   0 15.46  5.55  3.50
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  8.33  40  0.32   1  99   0 15.46  5.55  3.50
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  8.93 843  7.36   0 100   0 17.43  6.12  3.72
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  8.84 560  4.84   0 100   0 17.43  6.12  3.72
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  9.06 428  3.79   1  99   0 17.43  6.12  3.72
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  8.00  12  0.10   0 100   0 17.43  6.12  3.72
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  8.92 243  2.12   0  91   9 17.43  6.12  3.72
--> unit recovered here:
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00 13.11 628  8.03   0   2  97 16.03  6.02  3.69
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  8.00 517  4.04   0   2  97 16.03  6.02  3.69
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  8.88 511  4.43   0   2  97 16.03  6.02  3.69
  0.00   0  0.00  0.00   0  0.00  0.00   0  0.00  9.02 503  4.43   0   2  98 16.03  6.02  3.69
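
To put a number on the stall, the sy (system CPU) column from the trace above can be scanned for the longest run of saturated samples. This is only a sketch: the 90% threshold and the function name are illustrative assumptions, and it treats the samples as one second apart, which may understate the real stall because iostat itself stopped reporting at times.

```python
def longest_run_at_or_above(samples, threshold):
    """Length of the longest consecutive run of samples >= threshold."""
    best = current = 0
    for value in samples:
        current = current + 1 if value >= threshold else 0
        best = max(best, current)
    return best


# sy column copied from the iostat output above, in order.
SY = [
    4, 2, 2, 2,                      # before ctrl-c
    43, 79, 38, 80,                  # first samples after ctrl-c
    99, 99, 100, 100, 100, 100, 100, 100, 100, 100,
    100, 100, 99, 100, 99, 100, 100, 99, 100, 91,
    2, 2, 2, 2,                      # after recovery
]

if __name__ == "__main__":
    run = longest_run_at_or_above(SY, 90)
    print(run, "consecutive samples with sy >= 90")
```

Comparing this figure across runs (different settings, different kill times) gives a more consistent measure than watching top by eye.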

I installed PG 8.3 beta 3 to see if the behavior would be any
different. CPU usage in general seemed higher in PG 8.3 beta 3, and I
still get the spike when disconnecting multiple clients. On 8.2.5 I
also tried default settings (except for a higher max_connections),
only a higher shared_buffers, and only a higher checkpoint_segments.
The CPU still spiked to 100% in all of these cases, but it didn't seem
to stay there as long as when both checkpoint_segments and
shared_buffers are high. I suppose the only real difference may be the
point at which I killed pgbench.
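
Since the timing of the kill may matter, one way to make it repeatable is to script it. The following is a minimal sketch, not the procedure actually used for the runs above: it starts a command, waits a fixed delay, and sends SIGINT (the signal ctrl-c delivers). The function name and the 600-second delay are assumptions for illustration, and the pgbench invocation would need whatever connection options apply to the database under test.

```python
import signal
import subprocess
import time


def run_then_interrupt(cmd, delay_seconds):
    """Start cmd, wait delay_seconds, send SIGINT, return the exit status."""
    proc = subprocess.Popen(cmd)
    time.sleep(delay_seconds)
    if proc.poll() is None:          # still running, so interrupt it
        proc.send_signal(signal.SIGINT)
    return proc.wait()


if __name__ == "__main__":
    # Roughly 10 minutes of load before the interrupt, as in the test above.
    status = run_then_interrupt(["pgbench", "-c", "150", "-t", "5000"], 600)
    print("pgbench exit status:", status)
```

Running the same script with the same delay against each configuration would remove the kill time as a variable when comparing how long the CPU stays pegged.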

I'm not sure if this is a bug with PostgreSQL or OS X 10.5.1. Any
suggestions on what I can do to narrow down the problem further would
be greatly appreciated.

Brian Wipf
ClickSpace Interactive Inc.
<brian(at)clickspace(dot)com>
