Issue with Linux+Pentium SMP Context Switching

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Issue with Linux+Pentium SMP Context Switching
Date: 2003-12-19 18:30:13
Message-ID: 200312191030.13499.josh@agliodbs.com
Lists: pgsql-hackers

Folks,

I brought up this issue a couple of weeks ago on the Performance list. Since
then, I've gotten e-mail confirmation from a few other users seeing this
problem. Here's the shape of the problem; we just don't know what causes it.
I've been trying to do some profiling, but since I only have production
systems to work with, it's been really slow -- I have to wait for weekly
downtime for each test. I'm hoping that someone with a greater knowledge
of Linux kernel internals and a good test machine can help out.

Linux Versions Reported: Red Hat and Gentoo, kernels 2.4.18 to 2.4.22.
    Not tested on other distros/kernels. Kernels are SMP-enabled.
Hardware: Intel Pentium III and 4 dual-processor systems. 5 of the 6
    reported machines are made by Dell; the other is a home-build.
    Demonstrated on both hyper-threaded and non-hyper-threaded Xeons.
    Cannot be reproduced on Athlons.

Description of the Problem:
When a query is made against a table with millions of rows that requires a
seq scan, large hash join, per-row calculations or other intensive operation,
the system climbs to tens or hundreds of thousands of context switches per
second (contrast with, for example, 5,000 context switches/second on an
Athlon MP). This hurts performance significantly, possibly as much as
doubling query execution time.
Initial debug logging of a test on one Xeon system demonstrating this issue
showed a very large number of unattributed semop() calls. We are still
following up on this.
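
For anyone who wants to compare numbers on their own hardware, here is a minimal sketch of one way to measure the context-switch rate while a heavy query runs. It assumes a Linux system where /proc/stat exposes a cumulative "ctxt" counter (the same figure that the "cs" column of vmstat is derived from). The function names read_ctxt, switches_per_second and sample are hypothetical, and this is not the tooling that produced the figures above.

```python
import time


def read_ctxt(stat_text):
    """Return the cumulative context-switch count from /proc/stat contents."""
    for line in stat_text.splitlines():
        fields = line.split()
        # The line of interest looks like: "ctxt 123456789"
        if len(fields) >= 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found in the supplied text")


def switches_per_second(before, after, elapsed_seconds):
    """Convert two cumulative counter readings into a per-second rate."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (after - before) / elapsed_seconds


def sample(interval=1.0, path="/proc/stat"):
    """Read the counter twice, `interval` seconds apart, and return the rate."""
    with open(path) as f:
        before = read_ctxt(f.read())
    start = time.monotonic()
    time.sleep(interval)
    with open(path) as f:
        after = read_ctxt(f.read())
    elapsed = time.monotonic() - start
    return switches_per_second(before, after, elapsed)


if __name__ == "__main__":
    # Run this in one terminal while the problem query runs in another.
    for _ in range(5):
        print("context switches/second: %.0f" % sample(1.0))
```

Running it once on an idle machine and once during the seq scan or hash join should show whether a given system falls into the "tens or hundreds of thousands" range described above.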

In discussions with Linux kernel hackers online, they blame the way that
PostgreSQL uses shared memory. Whether or not they are correct, the effect
of the issue is to harm PostgreSQL's performance and make us look bad on one
of the major "enterprise" systems of choice: the multi-processor Xeon system.

Ideas, anyone?

--
Josh Berkus
Aglio Database Solutions
San Francisco
