Runaway backend at 100% CPU PostgreSQL v8.3.5

From: "Stephen R(dot) van den Berg" <srb(at)cuci(dot)nl>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Runaway backend at 100% CPU PostgreSQL v8.3.5
Date: 2009-01-06 23:11:34
Message-ID: 20090106231134.GA13379@cuci.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I'm running Debian PostgreSQL v8.3.5-1 on x86 in 32-bit mode.
Every once in a while, some backends start taking 100% CPU, as can be seen
below in the excerpt from the process table:

27256 ? Ss 0:04 /usr/lib/postgresql/8.3/bin/postgres -D /var/lib/post
27299 ? Ss 0:00 \_ postgres: writer process
27300 ? Ss 0:00 \_ postgres: wal writer process
27301 ? Ss 0:00 \_ postgres: autovacuum launcher process
27302 ? Ss 0:26 \_ postgres: stats collector process
12076 ? Rs 5681:50 \_ postgres: cms cms 10.0.0.5(59125) PARSE
12853 ? Rs 5549:25 \_ postgres: cms cms 10.0.0.5(48437) PARSE
25985 ? Ss 0:00 \_ postgres: cms cms 10.0.0.18(45807) idle
25986 ? Ss 0:00 \_ postgres: cms cms 10.0.0.18(45808) idle

It seems that the backend is stuck in some kind of endless loop. Since
it's a production Debian server, the backend is not compiled with debugging
turned on. The best I can do is ltrace it, in hopes of someone recognising
the infinite sequence.
Ltracing the 12853 process reveals the following libc calls:

Label a:
[pid 12853] __sigsetjmp(0xbfec7000, 0, 0x852e740, 0xb5b3fc16, 0x85e9228) = 0
[pid 12853] strlen("ExecutorState") = 13
[pid 12853] strcpy(0x84fb0e8, "ExecutorState") = 0x84fb0e8
[pid 12853] malloc(8192) = 0x860da48
[pid 12853] strlen("ExprContext") = 11
[pid 12853] strcpy(0x84fb060, "ExprContext") = 0x84fb060
[pid 12853] strlen("ExprContext") = 11
[pid 12853] strcpy(0x84fbda0, "ExprContext") = 0x84fbda0
[pid 12853] strncpy(0x860e498, "parent", 64) = 0x860e498
[pid 12853] strncpy(0x860e500, "sid", 64) = 0x860e500
[pid 12853] strlen("ExprContext") = 11
[pid 12853] strcpy(0x85cd810, "ExprContext") = 0x85cd810
[pid 12853] malloc(8256) = 0x8600a28
[pid 12853] memmove(0x860e738, 0x860e600, 40, 0, 0x82eada9) = 0x860e738
[pid 12853] strncpy(0x860e7f0, "parent", 64) = 0x860e7f0
[pid 12853] strncpy(0x860e858, "sid", 64) = 0x860e858
[pid 12853] strlen("SPI TupTable") = 12
[pid 12853] strcpy(0x85cd1b0, "SPI TupTable") = 0x85cd1b0
[pid 12853] malloc(8192) = 0x860fa50
[pid 12853] memmove(0x860e738, 0x860e600, 40, 0x830fab4, 0x84fb088) = 0x860e738
[pid 12853] free(0x8600a28) = <void>
[pid 12853] free(0x860da48) = <void>
[pid 12853] free(0x860fa50) = <void>
[pid 12853] __sigsetjmp(0xbfec7000, 0, 0x852e740, 0xb5b3fc16, 0x85e9228) = 0
[pid 12853] strlen("ExecutorState") = 13
[pid 12853] strcpy(0x85cd1b0, "ExecutorState") = 0x85cd1b0
[pid 12853] malloc(8192) = 0x860da48
[pid 12853] strlen("ExprContext") = 11
[pid 12853] strcpy(0x84fb0e8, "ExprContext") = 0x84fb0e8
[pid 12853] strlen("ExprContext") = 11
[pid 12853] strcpy(0x84fb060, "ExprContext") = 0x84fb060
[pid 12853] strncpy(0x860e498, "parent", 64) = 0x860e498
[pid 12853] strncpy(0x860e500, "sid", 64) = 0x860e500
[pid 12853] strlen("ExprContext") = 11
[pid 12853] strcpy(0x84fbda0, "ExprContext") = 0x84fbda0
[pid 12853] malloc(8256) = 0x8600a28
[pid 12853] memmove(0x860e738, 0x860e600, 40, 0, 0x82eada9) = 0x860e738
[pid 12853] strncpy(0x860e7f0, "parent", 64) = 0x860e7f0
[pid 12853] strncpy(0x860e858, "sid", 64) = 0x860e858
[pid 12853] strlen("SPI TupTable") = 12
[pid 12853] strcpy(0x85cd810, "SPI TupTable") = 0x85cd810
[pid 12853] malloc(8192) = 0x860fa50
[pid 12853] memmove(0x860e738, 0x860e600, 40, 0x830fab4, 0x85cd150) = 0x860e738
[pid 12853] free(0x8600a28) = <void>
[pid 12853] free(0x860da48) = <void>
[pid 12853] free(0x860fa50) = <void>
[pid 12853] __sigsetjmp(0xbfec7000, 0, 0x852e740, 0xb5b3fc16, 0x85e9228) = 0
[pid 12853] strlen("ExecutorState") = 13
[pid 12853] strcpy(0x85cd810, "ExecutorState") = 0x85cd810
[pid 12853] malloc(8192) = 0x860da48
[pid 12853] strlen("ExprContext") = 11
[pid 12853] strcpy(0x85cd1b0, "ExprContext") = 0x85cd1b0
[pid 12853] strlen("ExprContext") = 11
[pid 12853] strcpy(0x84fb0e8, "ExprContext") = 0x84fb0e8
[pid 12853] strncpy(0x860e498, "parent", 64) = 0x860e498
[pid 12853] strncpy(0x860e500, "sid", 64) = 0x860e500
[pid 12853] strlen("ExprContext") = 11
[pid 12853] strcpy(0x84fb060, "ExprContext") = 0x84fb060
[pid 12853] malloc(8256) = 0x8600a28
[pid 12853] memmove(0x860e738, 0x860e600, 40, 0, 0x82eada9) = 0x860e738
[pid 12853] strncpy(0x860e7f0, "parent", 64) = 0x860e7f0
[pid 12853] strncpy(0x860e858, "sid", 64) = 0x860e858
[pid 12853] strlen("SPI TupTable") = 12
[pid 12853] strcpy(0x84fbda0, "SPI TupTable") = 0x84fbda0
[pid 12853] malloc(8192) = 0x860fa50
[pid 12853] memmove(0x860e738, 0x860e600, 40, 0x830fab4, 0x85cd7b0) = 0x860e738
[pid 12853] free(0x8600a28) = <void>
[pid 12853] free(0x860da48) = <void>
[pid 12853] free(0x860fa50) = <void>
[pid 12853] __sigsetjmp(0xbfec7000, 0, 0x852e740, 0xb5b3fc16, 0x85e9228) = 0
[pid 12853] strlen("ExecutorState") = 13
[pid 12853] strcpy(0x84fbda0, "ExecutorState") = 0x84fbda0
[pid 12853] malloc(8192) = 0x860da48
[pid 12853] strlen("ExprContext") = 11
[pid 12853] strcpy(0x85cd810, "ExprContext") = 0x85cd810
[pid 12853] strlen("ExprContext") = 11
[pid 12853] strcpy(0x85cd1b0, "ExprContext") = 0x85cd1b0
[pid 12853] strncpy(0x860e498, "parent", 64) = 0x860e498
[pid 12853] strncpy(0x860e500, "sid", 64) = 0x860e500
[pid 12853] strlen("ExprContext") = 11
[pid 12853] strcpy(0x84fb0e8, "ExprContext") = 0x84fb0e8
[pid 12853] malloc(8256) = 0x8600a28
[pid 12853] memmove(0x860e738, 0x860e600, 40, 0, 0x82eada9) = 0x860e738
[pid 12853] strncpy(0x860e7f0, "parent", 64) = 0x860e7f0
[pid 12853] strncpy(0x860e858, "sid", 64) = 0x860e858
[pid 12853] strlen("SPI TupTable") = 12
[pid 12853] strcpy(0x84fb060, "SPI TupTable") = 0x84fb060
[pid 12853] malloc(8192) = 0x860fa50
[pid 12853] memmove(0x860e738, 0x860e600, 40, 0x830fab4, 0x84fbd40) = 0x860e738
[pid 12853] free(0x8600a28) = <void>
[pid 12853] free(0x860da48) = <void>
[pid 12853] free(0x860fa50) = <void>
[pid 12853] __sigsetjmp(0xbfec7000, 0, 0x852e740, 0xb5b3fc16, 0x85e9228) = 0
[pid 12853] strlen("ExecutorState") = 13
[pid 12853] strcpy(0x84fb060, "ExecutorState") = 0x84fb060
[pid 12853] malloc(8192) = 0x860da48
[pid 12853] strlen("ExprContext") = 11
[pid 12853] strcpy(0x84fbda0, "ExprContext") = 0x84fbda0
[pid 12853] strlen("ExprContext") = 11
[pid 12853] strcpy(0x85cd810, "ExprContext") = 0x85cd810
[pid 12853] strncpy(0x860e498, "parent", 64) = 0x860e498
[pid 12853] strncpy(0x860e500, "sid", 64) = 0x860e500
[pid 12853] strlen("ExprContext") = 11
[pid 12853] strcpy(0x85cd1b0, "ExprContext") = 0x85cd1b0
[pid 12853] malloc(8256) = 0x8600a28
[pid 12853] memmove(0x860e738, 0x860e600, 40, 0, 0x82eada9) = 0x860e738
[pid 12853] strncpy(0x860e7f0, "parent", 64) = 0x860e7f0
[pid 12853] strncpy(0x860e858, "sid", 64) = 0x860e858
[pid 12853] strlen("SPI TupTable") = 12
[pid 12853] strcpy(0x84fb0e8, "SPI TupTable") = 0x84fb0e8
[pid 12853] malloc(8192) = 0x860fa50
[pid 12853] memmove(0x860e738, 0x860e600, 40, 0x830fab4, 0x84fb000) = 0x860e738
[pid 12853] free(0x8600a28) = <void>
[pid 12853] free(0x860da48) = <void>
[pid 12853] free(0x860fa50) = <void>
Jump back to label a:, repeat ad infinitum, chewing up 100% CPU

Below a complete process list of all running backends:

27256 ? Ss 0:04 /usr/lib/postgresql/8.3/bin/postgres -D /var/lib/post
27299 ? Ss 0:00 \_ postgres: writer process
27300 ? Ss 0:00 \_ postgres: wal writer process
27301 ? Ss 0:00 \_ postgres: autovacuum launcher process
27302 ? Ss 0:26 \_ postgres: stats collector process
12076 ? Rs 5681:50 \_ postgres: cms cms 10.0.0.5(59125) PARSE
12853 ? Ts 5549:25 \_ postgres: cms cms 10.0.0.5(48437) PARSE
25985 ? Ss 0:00 \_ postgres: cms cms 10.0.0.18(45807) idle
25986 ? Ss 0:00 \_ postgres: cms cms 10.0.0.18(45808) idle
21100 ? Ss 0:00 \_ postgres: cms cms 10.0.0.12(45544) idle
21101 ? Ss 0:00 \_ postgres: cms cms 10.0.0.12(45545) idle
26777 ? Ss 0:00 \_ postgres: cms cms 10.0.0.12(43123) idle
24711 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(34684) idle
24715 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(34687) idle
24716 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(34688) idle
24721 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(34693) idle
8355 ? Ss 0:01 \_ postgres: cms cms 10.0.0.5(59188) idle
13504 ? Ss 0:05 \_ postgres: cms cms 10.0.0.5(35244) idle
17568 ? Ss 0:01 \_ postgres: cms cms 10.0.0.5(50414) idle
17615 ? Ss 0:02 \_ postgres: cms cms 10.0.0.5(42313) idle
17617 ? Ss 0:01 \_ postgres: cms cms 10.0.0.5(42315) idle
20412 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(43902) idle
20425 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(43905) idle
20427 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(43907) idle
20428 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(43908) idle
20429 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(43909) idle
20430 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(43910) idle
20431 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(43911) idle

--
Sincerely,
Stephen R. van den Berg.

"Very funny, Mr. Scott. Now beam down my clothes!"

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joshua D. Drake 2009-01-06 23:16:57 Re: Re: [COMMITTERS] pgsql: This makes all the \dX commands (most importantly to most: \df)
Previous Message Bruce Momjian 2009-01-06 23:08:34 Re: Re: [COMMITTERS] pgsql: This makes all the \dX commands (most importantly to most: \df)