Re: 9.2.2 - semop hanging

From: Rafael Domiciano <rafael(dot)domiciano(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: 9.2.2 - semop hanging
Date: 2013-07-01 18:06:34
Message-ID: CAL0i5M4Wu_9mc03QtjemwTh5ri29GR48P7X8xkwebi1Z9k6jgA@mail.gmail.com
Lists: pgsql-performance

Hello guys,

I've been trying to hunt down my problem and have found the following:

1) Emre Hasegeli suggested reducing my shared_buffers, but it's
already low:
total server memory: 141 GB
shared_buffers: 16 GB

Maybe it's too low? I've been thinking of increasing it to 32 GB.

max_connections = 500, with ~400 connections on average.

2) Since the processes hang on "semop", I tried the following, as suggested
on a tuning page on the web:

echo "250 32000 100 128" > /proc/sys/kernel/sem
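
For reference, the four values written to /proc/sys/kernel/sem are, in
order, SEMMSL, SEMMNS, SEMOPM and SEMMNI. Below is a minimal sketch that
labels them and estimates how many semaphores PostgreSQL needs, using the
formula from the 9.2 documentation: ceil((max_connections +
autovacuum_max_workers + 4) / 16) sets of 17 semaphores each. The function
names are made up for illustration, and autovacuum_max_workers = 3 is an
assumption (the default), not a value taken from this server.

```shell
#!/bin/sh
# Label the four fields of kernel.sem: SEMMSL SEMMNS SEMOPM SEMMNI.
# describe_sem is a hypothetical helper, not a standard tool.
describe_sem() {
    # $1..$4 are the four fields in the order the kernel prints them
    printf 'SEMMSL=%s SEMMNS=%s SEMOPM=%s SEMMNI=%s\n' "$1" "$2" "$3" "$4"
}

# Semaphore sets PostgreSQL 9.2 needs, per its documentation:
# ceil((max_connections + autovacuum_max_workers + 4) / 16)
# $1 = max_connections, $2 = autovacuum_max_workers (assumed value)
pg_sem_sets() {
    echo $(( ($1 + $2 + 4 + 15) / 16 ))
}

# Each set holds 17 semaphores (16 usable plus one marker semaphore).
pg_sem_total() {
    echo $(( $(pg_sem_sets "$1" "$2") * 17 ))
}

describe_sem 250 32000 100 128
echo "sets needed: $(pg_sem_sets 500 3)"
echo "semaphores needed: $(pg_sem_total 500 3)"
```

To label the live values, describe_sem $(cat /proc/sys/kernel/sem) should
work. Under these assumptions the estimate is 32 sets and 544 semaphores,
which is close to the 34 arrays and 546 semaphores that "ipcs -u" reports
in the quoted message below and far under the limits of 128 arrays and
32000 semaphores. Assuming the "ipcs -l" output below was taken before this
change, the echo only raises SEMOPM from 32 to 100.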

3) I think my problem could be related to LWLocks: I did some googling
and found similar problems and slides. Is there some way I can confirm
this?

4) Rebooting the server didn't make any difference.

Appreciate any help,

Rafael

On Tue, Jun 11, 2013 at 9:48 AM, Rafael Domiciano <
rafael(dot)domiciano(at)gmail(dot)com> wrote:

> Hello all you guys,
>
> Since Saturday I've been stuck in a very strange situation: from time to
> time (sometimes at intervals of less than 10 minutes), the server
> hangs (I don't know what else to call it) and every connection to
> postgres (no matter whether it's SELECT, UPDATE, DELETE, INSERT, startup,
> authentication...) seems to get paused; after some seconds (say ~10 or
> ~15 sec, sometimes less) everything goes back to normal.
>
> So, my first step was to check the disks. Running "iostat" showed nothing
> wrong with them. It's a RAID 10 of 4 x 600 GB SAS disks on an IBM Storage
> DS3512, over FC. IBM DS Storage Manager says that the disks are OK.
>
> Then, memory. Apparently almost no swap is being used:
> [###(at)### data]# free -m
>              total       used       free     shared    buffers     cached
> Mem:        145182     130977      14204          0         43     121407
> -/+ buffers/cache:       9526     135655
> Swap:         6143         65       6078
>
> No errors in /var/log/messages.
>
> Below is an strace of one of the processes, plus some other information
> that may be useful. Every process I've straced shows the same pattern: it
> seems to get stuck on semop.
>
> There has been no change to the server since last Monday, when I changed
> pg_hba.conf to authenticate via LDAP. The LDAP server is apparently OK, and
> tcpdump doesn't show any slow responses or heavy activity on that port.
>
> Any help is appreciated,
>
> [###(at)### ~]# strace -ttp 5209
> Process 5209 attached - interrupt to quit
> 09:01:54.122445 semop(2293765, {{15, -1, 0}}, 1) = 0
> 09:01:55.368785 semop(2293765, {{15, -1, 0}}, 1) = 0
> 09:01:55.368902 semop(2523148, {{11, 1, 0}}, 1) = 0
> 09:01:55.368978 semop(2293765, {{15, -1, 0}}, 1) = 0
> 09:01:55.369861 semop(2293765, {{15, -1, 0}}, 1) = 0
> 09:01:55.370648 semop(3047452, {{6, 1, 0}}, 1) = 0
> 09:01:55.370694 semop(2293765, {{15, -1, 0}}, 1) = 0
> 09:01:55.370762 semop(2785300, {{12, 1, 0}}, 1) = 0
> 09:01:55.370805 access("base/2048098929", F_OK) = 0
> 09:01:55.370953 open("base/2048098929/PG_VERSION", O_RDONLY) = 5
>
> [###(at)### data]# ipcs -l
>
> ------ Shared Memory Limits --------
> max number of segments = 4096
> max seg size (kbytes) = 83886080
> max total shared memory (kbytes) = 17179869184
> min seg size (bytes) = 1
>
> ------ Semaphore Limits --------
> max number of arrays = 128
> max semaphores per array = 250
> max semaphores system wide = 32000
> max ops per semop call = 32
> semaphore max value = 32767
>
> ------ Messages: Limits --------
> max queues system wide = 32768
> max size of message (bytes) = 65536
> default max size of queue (bytes) = 65536
>
> [###(at)### data]# ipcs -u
> ----- Semaphore Status -------
> used arrays: 34
> allocated semaphores: 546
>
> [###(at)### data]# uname -a
> Linux ### 2.6.32-279.14.1.el6.x86_64 #1 SMP Tue Nov 6 23:43:09 UTC 2012
> x86_64 x86_64 x86_64 GNU/Linux
>
> postgres=# select version();
> version
>
> --------------------------------------------------------------------------------------------------------------
> PostgreSQL 9.2.2 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6
> 20120305 (Red Hat 4.4.6-4), 64-bit
> (1 registro)
>
> [###(at)### data]# cat /etc/redhat-release
> CentOS release 6.3 (Final)
>
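
The stall shows up in the strace timestamps quoted above: a semop with
sem_op = -1 blocks until the semaphore can be decremented, and more than a
second passes between the first two calls. Below is a minimal sketch for
listing such gaps in "strace -tt" output. strace_gaps is a hypothetical
helper name, and the sketch assumes every line starts with an
HH:MM:SS.microseconds timestamp and that the trace does not cross midnight.

```shell
#!/bin/sh
# Print the gap in seconds between consecutive "strace -tt" lines.
# strace_gaps is a hypothetical helper; it reads strace output on stdin.
strace_gaps() {
    awk '{
        split($1, t, ":")                      # t[1]=HH, t[2]=MM, t[3]=SS.usec
        now = t[1] * 3600 + t[2] * 60 + t[3]   # seconds since midnight
        if (NR > 1) printf "%.6f\n", now - prev
        prev = now
    }'
}

# Sample input: the first three lines of the trace quoted above.
strace_gaps <<'EOF'
09:01:54.122445 semop(2293765, {{15, -1, 0}}, 1) = 0
09:01:55.368785 semop(2293765, {{15, -1, 0}}, 1) = 0
09:01:55.368902 semop(2523148, {{11, 1, 0}}, 1) = 0
EOF
```

Each gap covers both the time blocked inside the previous call and any
time spent in user space before the next one, so it is only an upper bound
on the wait. Running strace with -T as well would report the time spent
inside each system call directly.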
