Skip site navigation (1) Skip section navigation (2)

Re: pl/pgsql function spikes CPU 100%

From: Jeff Frost <jeff(at)frostconsultingllc(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: pl/pgsql function spikes CPU 100%
Date: 2007-04-24 06:18:20
Message-ID: Pine.LNX.4.64.0704232306500.9190@discord.home.frostconsultingllc.com (view raw or flat)
Thread:
Lists: pgsql-admin
On Fri, 16 Mar 2007, Jeff Frost wrote:

> On Fri, 16 Mar 2007, Jeff Frost wrote:
>
>> On Fri, 16 Mar 2007, Tom Lane wrote:
>> 
>>> Jeff Frost <jeff(at)frostconsultingllc(dot)com> writes:
>>>> ... Interestingly, when you
>>>> strace the backend, it doesn't appear to be doing too much...here's some
>>>> sample output:
>>> 
>>>> select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
>>>> semop(3932217, 0x7fbfffd150, 1)         = 0
>>>> semop(3932217, 0x7fbfffd150, 1)         = 0
>>>> semop(3932217, 0x7fbfffd150, 1)         = 0
>>>> semop(3932217, 0x7fbfffd150, 1)         = 0
>>>> semop(3932217, 0x7fbfffd150, 1)         = 0
>>>> select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
>>>> semop(3997755, 0x7fbfffd170, 1)         = 0
>>>> semop(3932217, 0x7fbfffd150, 1)         = 0
>>> 
>>> This looks suspiciously like the sort of trace we saw in the various
>>> "context swap storm" threads.  The test cases for those generally
>>> involved really tight indexscan loops, ie, the backends were spending
>>> all their time trying to access shared buffers.  If you haven't changed
>>> the function or the data, then I concur with the nearby worry about
>>> autovacuuming (large buildup of dead tuples could result in this symptom).
>>> Or maybe you've got an old open transaction that is blocking cleanup?
>> 
>> Tom,
>> 
>> I doubt it's a problem with autovacuum as the data in this server was just 
>> loaded an hour before the strace above was taken, so there should have been 
>> almost no dead tuples, especially since these tables are nearly write once. 
>> I.e. they get written to once, then the populate function updates them, 
>> then months later they get archived off.
>> 
>> There did not appear to be high context switch activity nor any IO wait to 
>> mention during the time I was watching the postmaster.  If it's worth 
>> mentioning, it's running CentOS 4.4 with the kernel-2.6.9-34.EL kernel.
>
> Oh, and I meant to mention that this query was the only thing in 
> pg_stat_activity during the painful time it was running, and there were no 
> ungranted locks in pg_locks.

Well, finally got this system upgraded to 8.1.8, but unfortunately, that did 
not resolve this.  Is there any reasonable way to see where this function is 
spending it's time?  The on disk size of the tables in question are about 
500MB each including the indexes.  The machine in question has 8GB of RAM and 
the total DB working set size is about 7GB.

-- 
Jeff Frost, Owner 	<jeff(at)frostconsultingllc(dot)com>
Frost Consulting, LLC 	http://www.frostconsultingllc.com/
Phone: 650-780-7908	FAX: 650-649-1954

In response to

Responses

pgsql-admin by date

Next:From: Tom LaneDate: 2007-04-24 06:33:13
Subject: Re: pl/pgsql function spikes CPU 100%
Previous:From: MageshwaranDate: 2007-04-24 03:32:52
Subject: Regarding WAL

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group