BDR, wal sender, high system cpu, mutex_lock_common

From: milist ujang <ujang(dot)milist(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: BDR, wal sender, high system cpu, mutex_lock_common
Date: 2017-10-03 11:45:17
Message-ID: CACG9ogw8Km5LFSVSQhgWOj0xcyHeip4bvp9+FXGWOtkS7mwQHQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi all,

I've an environment 9.4 + bdr:
PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (Debian
4.7.2-5) 4.7.2, 64-bit

kernel version:
3.2.0-4-amd64 #1 SMP Debian 3.2.65-1 x86_64 GNU/Linux

This is consolidation databases, in this machine there are around 250+ wal
sender processes.

top output revealed high system cpu:
%Cpu(s): 1.4 us, 49.7 sy, 0.0 ni, 48.8 id, 0.0 wa, 0.0 hi, 0.0 si,
0.0 st

profiling cpu with perf:

perf top -e cpu-clock

Events: 142K cpu-clock
82.37% [kernel] [k] __mutex_lock_common.isra.5
4.49% [kernel] [k] do_raw_spin_lock
2.23% [kernel] [k] mutex_lock
2.16% [kernel] [k] mutex_unlock
2.12% [kernel] [k] arch_local_irq_restore
1.73% postgres [.] ValidXLogRecord
0.87% [kernel] [k] __mutex_unlock_slowpath
0.78% [kernel] [k] arch_local_irq_enable
0.63% [kernel] [k] sys_recvfrom

finally get which processes (wal senders) that are using mutexes:

perf top -e task-clock -p 55382

Events: 697 task-clock
88.08% [kernel] [k] __mutex_lock_common.isra.5
3.27% [kernel] [k] do_raw_spin_lock
2.34% [kernel] [k] arch_local_irq_restore
2.10% postgres [.] ValidXLogRecord
1.87% [kernel] [k] mutex_unlock
1.87% [kernel] [k] mutex_lock
0.47% [kernel] [k] sys_recvfrom

strace output of wal sender process:

% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
98.30 1.030072 5 213063 201463 read
1.69 0.017686 0 201464 201464 recvfrom
0.01 0.000110 0 806 lseek
0.00 0.000043 0 474 468 rt_sigreturn
0.00 0.000000 0 6 open
0.00 0.000000 0 6 close
------ ----------- ----------- --------- --------- ----------------
100.00 1.047911 415819 403395 total

strace detail, majority read from pipe and recvfrom from socket (but most
of them are EAGAIN):

read(15, "~\320\5\0\1\0\0\0\0(at)\235\1\360\16\0\0\334\26\0\0\0\0\0\0\365\27\0\0\0\0\0\0"...,
8192) = 8192 <0.000025>
read(6, 0x7fffdd837b3f, 1) = -1 EAGAIN (Resource temporarily unavailable)
<0.000116>
recvfrom(10, 0x7fffdd837b17, 1, 0, 0, 0) = -1 EAGAIN (Resource temporarily
unavailable) <0.000049>

ls -l /proc/62388/fd/15
lr-x------ 1 postgres postgres 64 Oct 1 08:39 /proc/62388/fd/15 ->
/data/pg_xlog/0000000100000EF000000061

ls -l /proc/62388/fd/6
lr-x------ 1 postgres postgres 64 Oct 1 08:39 /proc/62388/fd/6 ->
pipe:[1090892506]

ls -l /proc/62388/fd/10
lrwx------ 1 postgres postgres 64 Oct 1 08:39 /proc/62388/fd/10 ->
socket:[1096584060]

I wonder, is there kernel version has better handling mutexes?
Or is it the expected behavior?

Sorry for cross-posting, I have posted the same on pgsql-performance too....

--
regards

ujang jaenudin | DBA Consultant (Freelancer)
http://ora62.wordpress.com
http://id.linkedin.com/pub/ujang-jaenudin/12/64/bab

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Craig Ringer 2017-10-03 13:49:55 Re: BDR, wal sender, high system cpu, mutex_lock_common
Previous Message Laurenz Albe 2017-10-03 07:38:31 Re: Checkpoint write time - anything unusual?