Re: long wait times in ProcessCatchupEvent()

From: Craig James <craig_james(at)emolecules(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: long wait times in ProcessCatchupEvent()
Date: 2010-12-29 19:28:25
Message-ID: 4D1B8BD9.4060105@emolecules.com
Lists: pgsql-performance

On 12/29/10 6:28 AM, Julian v. Bock wrote:
> I have the problem that on our servers it happens regularly under a
> certain workload (several times per minute) that all backend processes
> get a SIGUSR1 and spend several seconds in ProcessCatchupEvent(). At
> 100-200 connections (most of them idle) this causes the system load to
> skyrocket. I am not really familiar with the code but my wild guess is
> that the processes spend most of their time waiting for spinlocks.
>
> We have reduced the number of connections as much as possible for now
> but it still makes up for roughly 50% of the total CPU time. Has
> anyone experienced a similar problem?
>
> I can reproduce the issue on a test system with production data but it
> is not so easy to pinpoint what exactly causes the problem. The queries
> are basically tsearch2 full text searches over moderately big tables
> (~35GB). The queries are performed by functions which aggregate data
> from partitions in temporary tables, cache some data, and perform
> calculations before returning it to the user.
>
> The PostgreSQL version is 8.3.12, the test server has 8 amd64 cores
> and 16GB of ram. I experimented with shared_buffers between 1GB and
> 4GB but it doesn't make much of a difference. Disk IO doesn't seem to
> be an issue here.

This sounds like the exact same problem I had on Postgres 8.3 and 8.4:

http://archives.postgresql.org/pgsql-performance/2010-04/msg00071.php

Updating to Postgres version 9 fixed it. Here is what appeared to be the best analysis of what was happening, but we never confirmed it.

http://archives.postgresql.org/pgsql-performance/2010-06/msg00464.php

Craig
