Re: Wierd context-switching issue on Xeon

From: ohp(at)pyrenet(dot)fr
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: josh(at)agliodbs(dot)com, Joe Conway <mail(at)joeconway(dot)com>, "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, lutzeb(at)aeccom(dot)com, pgsql-performance(at)postgresql(dot)org, Neil Conway <neilc(at)samurai(dot)com>
Subject: Re: Wierd context-switching issue on Xeon
Date: 2004-04-21 11:18:53
Message-ID: Pine.UW2.4.53.0404211315000.9232@server.pyrenet.fr
Lists: pgsql-performance

How long is this test supposed to run?

I launched just one copy for testing, and the plan looks horrible: the
test is CPU-bound and still hasn't finished after 17:02 minutes of CPU
time. This is on a dual Xeon 2.6 GHz running UnixWare 7.1.3.

The machine is a Fujitsu-Siemens TX 200 server.

On Mon, 19 Apr 2004, Tom Lane wrote:

> Date: Mon, 19 Apr 2004 20:01:56 -0400
> From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> To: josh(at)agliodbs(dot)com
> Cc: Joe Conway <mail(at)joeconway(dot)com>, scott.marlowe <scott(dot)marlowe(at)ihs(dot)com>,
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, lutzeb(at)aeccom(dot)com,
> pgsql-performance(at)postgresql(dot)org, Neil Conway <neilc(at)samurai(dot)com>
> Subject: Re: [PERFORM] Wierd context-switching issue on Xeon
>
> Here is a test case. To set up, run the "test_setup.sql" script once;
> then launch two copies of the "test_run.sql" script. (For those of
> you with more than two CPUs, see whether you need one per CPU to make
> trouble, or whether two test_runs are enough.) Check that you get a
> nestloops-with-index-scans plan shown by the EXPLAIN in test_run.
>
> In isolation, test_run.sql should do essentially no syscalls at all once
> it's past the initial ramp-up. On a machine that's functioning per
> expectations, multiple copies of test_run show a relatively low rate of
> semop() calls --- a few per second, at most --- and maybe a delaying
> select() here and there.
>
> What I actually see on Josh's client's machine is a context swap storm:
> "vmstat 1" shows CS rates around 170K/sec. strace'ing the backends
> shows a corresponding rate of semop() syscalls, with a few delaying
> select()s sprinkled in. top(1) shows system CPU percent of 25-30
> and idle CPU percent of 16-20.
>
> I haven't bothered to check how long the test_run query takes, but if it
> ends while you're still examining the behavior, just start it again.
>
> Note the test case assumes you've got shared_buffers set to at least
> 1000; with smaller values, you may get some I/O syscalls, which will
> probably skew the results.
>
> regards, tom lane
>
>
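
For anyone who wants to watch the context-switch rate described above
without vmstat, here is a minimal sketch. It assumes a Linux box, where
/proc/stat carries a cumulative "ctxt" counter (this will not work on
UnixWare). The function names are made up for illustration and are not
part of the test case.

```python
import time


def parse_ctxt(stat_text):
    """Return the cumulative context-switch count from /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        # The line of interest looks like: "ctxt 123456789"
        if len(fields) >= 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found")


def cs_rate(before, after, seconds):
    """Context switches per second between two cumulative samples."""
    return (after - before) / seconds


def read_ctxt(path="/proc/stat"):
    """Read the current counter (assumption: Linux-style /proc/stat)."""
    with open(path) as f:
        return parse_ctxt(f.read())


def watch(samples=10, interval=1.0):
    """Print the rate once per interval, roughly like the CS column of 'vmstat 1'."""
    prev = read_ctxt()
    for _ in range(samples):
        time.sleep(interval)
        cur = read_ctxt()
        print(f"{cs_rate(prev, cur, interval):.0f} cs/sec")
        prev = cur
```

Calling watch() while the test_run copies are active should make a storm
on the order of the 170K/sec quoted above easy to spot; a healthy machine
should show a far lower figure.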

--
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou +33-5-61-50-97-01 (Fax)
31190 AUTERIVE +33-6-07-63-80-64 (GSM)
FRANCE Email: ohp(at)pyrenet(dot)fr
------------------------------------------------------------------------------
Make your life a dream, make your dream a reality. (St Exupery)
