Hi,
Intel recommends adding a special pause instruction to spinlock busy
loops. It's necessary for hyperthreading - without it, the cpu can't
figure out that a logical thread is doing no useful work and incorrectly
awards lots of execution resources to that thread. Additionally, it's
supposed to reduce the time the cpu needs to recover from the
(mispredicted) branch once the spinlock has been obtained.
The attached patch adds a new platform hook and implements it for i386.
The new instruction is backward compatible, thus no cpu detection is
necessary.
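
For readers without the patch at hand, here is a rough sketch of what such
a hook could look like. The names cpu_spin_hint and spin_with_hint are
made up for this example and are not taken from the patch, and GCC-style
inline assembly is assumed. "rep; nop" is the byte encoding of PAUSE,
which older x86 cpus execute as a plain NOP:

```c
/* Hypothetical hook name, for illustration only. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
static inline void cpu_spin_hint(void)
{
    /* "rep; nop" assembles to F3 90, i.e. PAUSE; pre-P4 cpus treat it as NOP */
    __asm__ __volatile__("rep; nop" ::: "memory");
}
#else
static inline void cpu_spin_hint(void)
{
    /* other platforms: nothing to do */
}
#endif

/* Illustrative busy loop: issue the hint n times, return the iteration count. */
static int spin_with_hint(int n)
{
    int i;

    for (i = 0; i < n; i++)
        cpu_spin_hint();
    return i;
}
```

In a real spinlock the hint would sit inside the loop that retries the
test-and-set, not in a fixed-count loop like this one.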
Additionally I've increased the number of spin loops before delaying from
100 to 1000 - a 3 GHz Pentium 4 might execute 100 loops faster than a
single bus transaction. I don't know if this change is appropriate for
all platforms, or if SPINS_PER_DELAY should be made platform specific.
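
To make the platform-specific option concrete, here is a minimal sketch
of a spin loop whose SPINS_PER_DELAY can be overridden at compile time.
It is not the code from the patch: demo_spinlock, demo_spin_acquire,
demo_spin_release and the max_delays give-up limit are invented for the
example, C11 atomics stand in for the platform's test-and-set, and
sched_yield() (POSIX) stands in for the real delay:

```c
#include <sched.h>
#include <stdatomic.h>

/* A platform header could define its own value before this point. */
#ifndef SPINS_PER_DELAY
#define SPINS_PER_DELAY 1000
#endif

typedef struct
{
    atomic_flag locked;
} demo_spinlock;

/*
 * Spin until the lock is taken. After every SPINS_PER_DELAY failed
 * attempts, back off once. Returns the number of back-offs taken, or -1
 * if the lock was still held after max_delays back-offs.
 */
static int demo_spin_acquire(demo_spinlock *lock, int max_delays)
{
    int spins = 0;
    int delays = 0;

    while (atomic_flag_test_and_set_explicit(&lock->locked,
                                             memory_order_acquire))
    {
        if (++spins < SPINS_PER_DELAY)
            continue;           /* keep spinning */
        if (delays >= max_delays)
            return -1;          /* give up */
        sched_yield();          /* stand-in for the real delay */
        delays++;
        spins = 0;
    }
    return delays;
}

static void demo_spin_release(demo_spinlock *lock)
{
    atomic_flag_clear_explicit(&lock->locked, memory_order_release);
}
```

Building with e.g. -DSPINS_PER_DELAY=100 would restore the old value on a
platform where 1000 turns out to be too many.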
Mark did a test run with his dbt-2 benchmark on a 4-way Xeon with HT
enabled, and the patch resulted in a 10% performance increase:
Before:
http://developer.osdl.org/markw/dbt2-pgsql/284/
After:
http://developer.osdl.org/markw/dbt2-pgsql/300/
--
Manfred