Re: [HACKERS] Postmaster dies with many child processes (spinlock/semget failed)

From:	Patrick Verdon <patrick(at)kan(dot)co(dot)uk>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: [HACKERS] Postmaster dies with many child processes (spinlock/semget failed)
Date:	1999-01-29 16:05:28
Message-ID:	36B1DC48.8C52FD92@kan.co.uk
Lists:	pgsql-hackers

Tatsuo, Vadim, Oleg, Scrappy,

Many thanks for the response.

A couple of you weren't convinced that this
is a Postgres problem, so let me try to clear
things up a little. Maybe the use of
Apache and mod_perl is confusing the issue -
the point I was trying to make is that if
there are 49+ concurrent postgres processes
on a normal machine (i.e. where kernel
parameters are the defaults, etc.) the
postmaster dies in a nasty way with
potentially damaging results.

Here's a case without Apache/mod_perl that
causes exactly the same behaviour. Simply
enter the following 49 times:

kandinsky:patrick> psql template1 &

Note that I tried to automate this without
success:

perl -e 'for ( 1..49 ) { system("/usr/local/pgsql/bin/psql template1 &"); }'
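
For anyone who wants to script the manual steps, here is a minimal sketch in Python. It rests on a guess that is not established in this message: that each backgrounded psql saw end-of-file on its standard input and exited at once, so the connections never overlapped. The helper names `spawn_clients` and `close_clients` are made up for illustration; only the psql path comes from the one-liner above.

```python
import subprocess

PSQL = "/usr/local/pgsql/bin/psql"  # path taken from the one-liner above


def spawn_clients(count, command):
    """Start `count` copies of `command`, each with its own open stdin pipe.

    Assumption: an interactive client such as psql blocks while its stdin
    stays open, so all the connections remain alive at the same time.
    """
    clients = []
    for _ in range(count):
        proc = subprocess.Popen(
            command,
            stdin=subprocess.PIPE,      # held open until close_clients()
            stdout=subprocess.DEVNULL,  # discard prompts and banners
            stderr=subprocess.PIPE,     # keep error text for inspection
            text=True,
        )
        clients.append(proc)
    return clients


def close_clients(clients):
    """Send EOF to every client, wait for it, and return (exit code, stderr)."""
    results = []
    for proc in clients:
        _, err = proc.communicate()  # closes stdin, then waits for exit
        results.append((proc.returncode, err))
    return results
```

A reproduction attempt would then be `clients = spawn_clients(49, [PSQL, "template1"])`, followed by `close_clients(clients)` once the server log has been checked. Whether this triggers the same failure depends on the guess above being right.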

The 49th attempt to initiate a connection
fails:

Connection to database 'template1' failed.
pqReadData() -- backend closed the channel unexpectedly.
This probably means the backend terminated abnormally before or while processing the request.

and the error_log says:

InitPostgres
IpcSemaphoreCreate: semget failed (No space left on device) key=5432017, num=16, permission=600
proc_exit(3) [#0]
shmem_exit(3) [#0]
exit(3)
/usr/local/pgsql/bin/postmaster: reaping dead processes...
/usr/local/pgsql/bin/postmaster: CleanupProc: pid 1521 exited with status 768
/usr/local/pgsql/bin/postmaster: CleanupProc: sending SIGUSR1 to process 1518
NOTICE: Message from PostgreSQL backend:
The Postmaster has informed me that some other backend died abnormally and possibly corrupted shared memory.
I have rolled back the current transaction and am going to terminate your database system connection and exit.
Please reconnect to the database system and repeat your query.

FATAL: s_lock(dfebe065) at spin.c:125, stuck spinlock. Aborting.
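
Two details in that log can be decoded mechanically, and the Python sketch below shows one way to do it. It maps the text "No space left on device" back to an errno name, which for semget is the error reported when a system semaphore limit would be exceeded, and it interprets "status 768" as a raw wait status. Both readings are assumptions: that the permission field is printed in octal, and that the postmaster logs the unmodified wait status. The function names are illustrative only.

```python
import errno
import os
import re

# The semget line exactly as it appears in the log above.
SEMGET_LINE = (
    "IpcSemaphoreCreate: semget failed (No space left on device) "
    "key=5432017, num=16, permission=600"
)

SEMGET_RE = re.compile(
    r"semget failed \((?P<msg>[^)]*)\) "
    r"key=(?P<key>\d+), num=(?P<num>\d+), permission=(?P<perm>[0-7]+)"
)


def parse_semget_failure(line):
    """Extract the fields of a semget failure line, or return None."""
    m = SEMGET_RE.search(line)
    if m is None:
        return None
    return {
        "message": m.group("msg"),
        "key": int(m.group("key")),
        "num": int(m.group("num")),               # semaphores requested
        "permission": int(m.group("perm"), 8),    # assumed to be octal
    }


def errno_names_for(message):
    """Return the errno names whose local strerror text equals `message`."""
    return sorted(
        name for code, name in errno.errorcode.items()
        if os.strerror(code) == message
    )


def decode_wait_status(status):
    """Interpret a raw wait status (assumed to be what the log prints)."""
    if os.WIFEXITED(status):
        return ("exited", os.WEXITSTATUS(status))
    if os.WIFSIGNALED(status):
        return ("signaled", os.WTERMSIG(status))
    return ("other", status)
```

With the conventional Unix encoding, `decode_wait_status(768)` gives an exit code of 3, which lines up with the `proc_exit(3)` and `exit(3)` entries in the log. That suggests the backend exited deliberately after semget failed rather than crashing on a signal.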

Even if there is a hard limit, there is no way that
Postgres should die in this spectacular fashion.
I wouldn't have said that it was unreasonable for
some large applications to peak at >48 processes
when using powerful hardware with plenty of RAM.

The other point is that even if one had 1 GB RAM,
Postgres won't scale beyond 48 processes, using
probably less than 100 MB of RAM. Would it be
possible to make the 'MaxBackendId' configurable
for those who have the resources?

I have reproduced this behaviour on both
FreeBSD 2.2.8 and Intel Solaris 2.6 using
version 6.4.x of PostgreSQL.

I'll try to change some of the parameters
suggested and see how far I get, but the bottom
line is that Postgres shouldn't be dying like this.

Let me know if you need any more info.

Cheers.

Patrick

#===============================#
\ KAN Design & Publishing Ltd /
/ T: +44 (0)1223 511134 \
\ F: +44 (0)1223 571968 /
/ E: mailto:patrick(at)kan(dot)co(dot)uk \
\ W: http://www.kan.co.uk /
#===============================#
