Re: sblock state on FreeBSD 6.1

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: sblock state on FreeBSD 6.1
Date: 2006-05-03 15:22:42
Message-ID: 20060503152242.GU97354@pervasive.com
Lists: pgsql-hackers

On Tue, May 02, 2006 at 11:06:59PM -0400, Tom Lane wrote:
> "Jim C. Nasby" <jnasby(at)pervasive(dot)com> writes:
> > Just experienced a server that was spending over 50% of CPU time in the
> > system, apparently dealing with postmasters that were in the sblock
> > state. Looking at the FreeBSD source, this indicates that the process is
> > waiting for a lock on a socket. During this time the machine was doing
> > nearly 200k context switches a second.
>
> Which operations require such a lock? If plain read/write needs the
> lock then heavy contention is hardly surprising.

From what little I've been able to decipher of the FBSD kernel source,
it appears that socket creation and destruction require the lock, as
does (at least) writing to the socket, though in the latter case it
depends on some flags/options.

> > Any ideas what areas of the code could be locking a socket?
> > Theoretically it shouldn't be the stats collector, and the site is using
> > pgpool as a connection pool, so this shouldn't be due to trying to
> > connect to backends at a furious rate.
>
> Actually, the stats socket seems like a really good bet to me, since all
> the backends will be interested in the same socket. The
> client-to-backend sockets are only touched by two processes each, so
> don't seem like big contention sources.

Do we take specific steps to ensure that we don't block when attempting
to write to these sockets? I *think* there's a flag that's associated
with the socket descriptor that determines locking behavior, but I
haven't been able to find a great deal of info.
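
For reference, the descriptor-level flag I have in mind is most likely
O_NONBLOCK, set with fcntl(). Below is a minimal sketch in Python, not
PostgreSQL's actual code; the helper names set_nonblocking and try_send
are made up for illustration. It puts a UDP socket into non-blocking
mode and attempts a send that gives up rather than waiting. Whether
this avoids the sblock wait on FreeBSD is an assumption I have not
verified.

```python
import fcntl
import os
import socket


def set_nonblocking(sock):
    """Set O_NONBLOCK on the socket's descriptor and return the new flags."""
    fd = sock.fileno()
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
    return fcntl.fcntl(fd, fcntl.F_GETFL)


def try_send(sock, payload, addr):
    """Attempt one datagram send; return False instead of blocking."""
    try:
        sock.sendto(payload, addr)
        return True
    except BlockingIOError:
        # The send buffer is full, so the message is simply dropped.
        return False
```

With this approach a sender that cannot write immediately loses the
message instead of sleeping, which is the trade-off behind dropping
stats messages.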

Right now the only way I can think of to reproduce this is to modify
the code so that we're passing a much larger amount of data to the
stats collector and then fire up pgbench. But I know there's been some
discussion about changing things so that we won't drop stats messages,
so maybe it's a moot point.

BTW, this server does have command string logging on, so if this is a
stats issue that probably made the problem worse. Would it be practical
to have backends only log the command string if the command runs for
more than some number of milliseconds? I suspect there's very little
case for actually trying to log every single command, so realistically
people are only going to care about commands that are taking a decent
amount of time.
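
To make the threshold idea concrete, here is a rough sketch in Python.
Everything in it is hypothetical: run_and_maybe_report, the execute and
report callbacks, and the 100 ms default are illustrative names and
values, not anything that exists in the backend. It also reports only
after the command finishes, which is a simplification; reporting a
command that is still running would need a timer of some kind.

```python
import time


def run_and_maybe_report(command, execute, report, threshold_ms=100.0):
    """Run execute(command); call report() only if it took long enough.

    execute and report are caller-supplied callables (hypothetical here).
    """
    start = time.monotonic()
    result = execute(command)
    elapsed_ms = (time.monotonic() - start) * 1000.0
    if elapsed_ms >= threshold_ms:
        # Only slow commands pay the cost of sending the command string.
        report(command, elapsed_ms)
    return result
```

Short commands would then never touch the stats socket for the command
string at all, which is where I'd expect most of the savings.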
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
