CPU causes 100% load in user space when ntp client runs and postgresql is under heavy load

From: Dennis Brouwer <dennis(dot)brouwer(at)m4n(dot)nl>
To: pgsql-admin(at)postgresql(dot)org
Subject: CPU causes 100% load in user space when ntp client runs and postgresql is under heavy load
Date: 2012-09-24 13:53:10
Message-ID: CAMfP6GYL4Tkga5MKO9=KR5n9Ht_4Nk3Qk4_K0g8Yug09FoJo7g@mail.gmail.com
Lists: pgsql-admin

Dear mailing list,

I am currently benchmarking postgresql-9.2 on Debian squeeze (Linux
2.6.32-5-amd64 x86_64 GNU/Linux).

The server used for benchmarking has a quad core E5-1620 and 32 GB RAM, and
for storage we use an LSI-9265 with 8 SSDs. The freshly restored database is
about 90 GB in size, so it deliberately doesn't fit in RAM; the aim is to
test the IO system.

The database mainly consists of a partitioned table with 6 partitions. To
test the performance I run 32 grouping queries in parallel against the
partitioned table. Every query runs in its own transaction. While the number
of concurrent queries may be higher than recommended, we consider this a
stress test as well.
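
A minimal sketch of how such a parallel run could be driven and timed is
shown here. It is not the harness actually used for these tests: `run_query`
is a hypothetical callable that is assumed to open its own connection, run
one query in its own transaction and return when done, and the factor-100
threshold in `slow_queries` is only taken from the slowdown described in this
message.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_parallel(run_query, queries, workers=32):
    """Run each query in its own worker and return (query, seconds) pairs.

    run_query is a hypothetical callable (an assumption of this sketch) that
    runs one query in its own transaction and returns when it has finished.
    """
    def timed(query):
        start = time.monotonic()  # monotonic clock: not stepped by NTP
        run_query(query)
        return query, time.monotonic() - start

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input queries
        return list(pool.map(timed, queries))


def slow_queries(timings, factor=100.0):
    """Return the queries that took at least `factor` times the median."""
    durations = sorted(seconds for _, seconds in timings)
    median = durations[len(durations) // 2]
    return [q for q, seconds in timings
            if median > 0 and seconds >= factor * median]
```

Calling `slow_queries(run_parallel(run_query, queries))` after each run would
list the queries that fell far behind the rest, which makes it easier to
correlate a bad run with what else was happening on the machine at the time.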

Last week I was repeatedly able to run all these tests on the database
without any issue, but recently, all of a sudden and at random, some of the
queries started running about a factor of 100 slower. A transaction may then
take hours to complete. At the same moment we see a dramatic decrease in IO
and the CPU is nearly 100% busy in user space.

After days of testing I may have found the cause: the ntp client. If I stop
the ntp client the problem vanishes.
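
One way to check whether the ntp client is stepping the system clock during a
test run is to compare the wall clock with the monotonic clock over time. The
sketch below is only a diagnostic aid under stated assumptions: the function
names and the 0.05 second threshold are made up for illustration, and a
detected step would not by itself prove that ntp causes the CPU load.

```python
import time


def clock_offset(wall=time.time, mono=time.monotonic):
    """Return wall-clock time minus monotonic time, in seconds."""
    return wall() - mono()


def detect_steps(samples, threshold=0.05):
    """Return the indices at which the offset changed by more than
    `threshold` seconds compared with the previous sample."""
    return [i for i in range(1, len(samples))
            if abs(samples[i] - samples[i - 1]) > threshold]


def watch(interval=1.0, count=60):
    """Collect `count` offset samples, `interval` seconds apart."""
    samples = []
    for _ in range(count):
        samples.append(clock_offset())
        time.sleep(interval)
    return samples
```

Running `detect_steps(watch())` alongside the benchmark gives the sample
positions at which the wall clock jumped relative to the monotonic clock.
Gradual slewing of the clock stays below the threshold and is not reported.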

I have started reading about spinlocks and other related material, but this
is all rather complicated stuff, so I kindly ask in what direction I should
search. The issue can be reproduced for both postgresql-9.1 and
postgresql-9.2 and perhaps can be rephrased as: very high CPU load in user
space (at random) with ntp enabled and (long?) running transactions.

Perhaps somebody from the mailing list has sufficient experience debugging
this kind of behaviour to exclude a bug in postgresql. Much appreciated!

Very kind regards,

Dennis Brouwer
M4N

P.S. If required I can provide more details, such as the queries,
auto_explain output, iostat, top, iotop, postgresql.conf and so on.
