Re: Timing overhead and Linux clock sources

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Ants Aasma <ants(dot)aasma(at)eesti(dot)ee>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Timing overhead and Linux clock sources
Date: 2011-12-10 01:26:12
Message-ID: 4EE2B534.40502@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 12/09/2011 06:48 PM, Ants Aasma wrote:
> The attached test program (test_gettimeofday_monotonic) shows that one
> test loop iteration takes 34ns with tsc and 1270ns with hpet.
>

This test program is great, I've wanted this exact sort of visibility
into this problem for years. I've toyed with writing something like
this and then seeing what results it returns on all of the build farm
animals. For now I can just run it on all the hardware I have access
to, which is not a small list.

I think we should bundle this up similarly to test_fsync, document some
best practices on time sources, and then point the vague warning about
EXPLAIN overhead toward that. Then new sources of timing overhead can
point there too. Much like low-level fsync timing, there's nothing
PostgreSQL can do about it, the best we can do is provide a program to
help users classify a system as likely or unlikely to run to have
high-overhead timing. I can make the needed docs changes, this is
resolving a long standing issue impacting code I wanted to add.

Rough guideline I'm seeing shape up is that <50ns is unlikely to cause
clock timing to be a significant problem, >500ns certainly is, and
values in the middle should concern people but not necessarily
invalidate the timing data collected.

I just confirmed that switching the clock source by echoing a new value
into /sys/devices/system/clocksource/clocksource0/current_clocksource
works (on the 2.6.32 kernel at least). What we want to see on a good
server is a result that looks like this, from my Intel i7-860 system:

$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc
$ ./test_gettimeofday_monotonic 5
Per loop: 39.30 ns
usec: count percent
4: 6 0.00000%
2: 104 0.00008%
1: 4999760 3.92983%
0: 122226109 96.07009%

Here's how badly that degrades if I use one of the alternate time sources:

# echo acpi_pm >
/sys/devices/system/clocksource/clocksource0/current_clocksource
$ ./test_gettimeofday_monotonic 5
Per loop: 727.65 ns
usec: count percent
16: 1 0.00001%
8: 0 0.00000%
4: 1233 0.01794%
2: 699 0.01017%
1: 4992622 72.65764%
0: 1876879 27.31423%

echo hpet > /sys/devices/system/clocksource/clocksource0/current_clocksource
$ ./test_gettimeofday_monotonic 5
Per loop: 576.96 ns
usec: count percent
8: 2 0.00002%
4: 1273 0.01469%
2: 767 0.00885%
1: 4993028 57.61598%
0: 3670977 42.36046%

Switching to the Intel T7200 CPU in my T60 laptop only provides the poor
quality time sources, not TSC, and results show timing is really slow there:

$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
hpet
$ ./test_gettimeofday_monotonic 5
Per loop: 1019.60 ns
usec: count percent
256: 2 0.00004%
128: 3 0.00006%
64: 90 0.00184%
32: 23 0.00047%
16: 92 0.00188%
8: 1246 0.02541%
4: 34 0.00069%
2: 136154 2.77645%
1: 4700772 95.85818%
0: 65466 1.33498%

# echo acpi_pm >
/sys/devices/system/clocksource/clocksource0/current_clocksource
$ ./test_gettimeofday_monotonic 5
Per loop: 1864.66 ns
usec: count percent
256: 2 0.00007%
128: 0 0.00000%
64: 3 0.00011%
32: 6 0.00022%
16: 90 0.00336%
8: 1741 0.06493%
4: 2062 0.07690%
2: 2260601 84.30489%
1: 416954 15.54952%
0: 0 0.00000%

This seems to be measuring exactly the problem I only had a hand-wave
"it's bad on this hardware" explanation of before.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2011-12-10 03:32:09 Why do regression tests now leave "regress_test_role_super" around?
Previous Message Ants Aasma 2011-12-09 23:48:45 Re: Timing overhead and Linux clock sources