Re: LWLock contention: I think I understand the problem

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org, jwbaker(at)acm(dot)org
Subject: Re: LWLock contention: I think I understand the problem
Date: 2002-01-03 07:55:26
Message-ID: 200201030755.g037tQH23540@candle.pha.pa.us
Lists: pgsql-hackers pgsql-odbc

Tatsuo Ishii wrote:
> > I have thought of a further refinement to the patch I produced
> > yesterday. Assume that there are multiple waiters blocked on (eg)
> > BufMgrLock. After we release the first one, we want the currently
> > running process to be able to continue acquiring and releasing the lock
> > for as long as its time quantum holds out. But in the patch as given,
> > each acquire/release cycle releases another waiter. This is probably
> > not good.
> >
> > Attached is a modification that prevents additional waiters from being
> > released until the first release has a chance to run and acquire the
> > lock. Would you try this and see if it's better or not in your test
> > cases? It doesn't seem to help on a single CPU, but maybe on multiple
> > CPUs it'll make a difference.
> >
> > To try to make things simple, I've attached the mod in two forms:
> > as a diff from current CVS, and as a diff from the previous patch.
>
> Ok, here is a pgbench (-s 10) result on an AIX 5L box (4 way).
>
> "7.2 with patch" is for the previous patch. "7.2 with patch (revised)"
> is for this patch. I see virtually no improvement. Please note
> that the x and y axes are now in log scale.

Well, there is clearly some good news in that graph. The unpatched 7.2
had _terrible_ performance for a few users. The patch clearly helped
that.

Both "7.2 with patch" tests show much better performance, close to
7.1. Interestingly, the first 7.2 patch shows better performance than
the later one. Perhaps that is because it is a 4-way system, where it
may be faster to start up more waiting backends, but the performance
difference is minor.
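
To make the difference between the two patches concrete, here is a toy
single-process model of the wakeup rule described in the quoted text. It
is only a sketch: ToyLock and the toy_* functions are made-up names for
illustration, and this is not the LWLock code or either patch. With
gate_wakeups off, every release wakes another waiter (the first patch);
with it on, no further waiter is woken until the previously woken one has
run and acquired the lock (the revised patch).

```c
#include <stdbool.h>

/* Toy model only; all names here are hypothetical. */
typedef struct ToyLock {
    bool held;          /* is the lock currently held? */
    bool gate_wakeups;  /* true = model the revised patch */
    bool release_ok;    /* may the next release wake a waiter? */
    int  waiters;       /* processes still blocked on the lock */
    int  wakeups;       /* total waiters woken so far */
} ToyLock;

static void toy_init(ToyLock *l, int waiters, bool gate_wakeups)
{
    l->held = false;
    l->gate_wakeups = gate_wakeups;
    l->release_ok = true;
    l->waiters = waiters;
    l->wakeups = 0;
}

/* Acquire by the currently running process; fails if already held. */
static bool toy_acquire(ToyLock *l)
{
    if (l->held)
        return false;
    l->held = true;
    return true;
}

/* Release the lock and wake at most one waiter, if the rule allows it. */
static void toy_release(ToyLock *l)
{
    l->held = false;
    if (l->waiters > 0 && (l->release_ok || !l->gate_wakeups)) {
        l->waiters--;
        l->wakeups++;
        l->release_ok = false;  /* only consulted when gating is on */
    }
}

/* A previously woken waiter finally runs and takes the lock. */
static bool toy_woken_acquire(ToyLock *l)
{
    if (!toy_acquire(l))
        return false;
    l->release_ok = true;       /* wakeups may resume */
    return true;
}

/* One process cycles acquire/release within its time quantum. */
static int toy_wakeups_after_cycles(int waiters, int cycles, bool gate)
{
    ToyLock l;
    int i;

    toy_init(&l, waiters, gate);
    for (i = 0; i < cycles; i++) {
        toy_acquire(&l);
        toy_release(&l);
    }
    return l.wakeups;
}
```

In this model, five acquire/release cycles with three waiters wake all
three of them without gating, but only one with gating. That matches my
guess above that the first patch gets more backends started sooner.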

I guess what really bothers me now is why the select() in 7.1 wasn't
slower than it was. We made 7.2 especially for multi-CPU systems, and
here we have essentially the same performance as 7.1. Tatsuo, is AIX
capable of sub-10-millisecond sleeps? I see there is a program to test
this in the archives, from Tom Lane:

http://fts.postgresql.org/db/mw/msg.html?mid=1217731

Tatsuo, can you run that program on the AIX box and tell us what it
reports? It would not surprise me if AIX supported sub-10ms select()
timing because I have heard AIX is a mix of Unix and IBM mainframe
code.

I have attached a clean version of the code because the web mail archive
munged the C code. I called it tst1.c. If you compile it and run it
like this, you should get output in this form (mine is from BSD/OS):

#$ time tst1 1

real 0m10.013s
user 0m0.000s
sys 0m0.004s

This runs select() 1000 times with a delay argument of 1. Ten seconds
for 1000 calls means about 10ms per select() on BSD/OS.
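
For anyone who cannot get the attachment, a test along these lines can be
sketched as below. This is not the attached tst1.c; it is a minimal
stand-in that assumes the delay is given in microseconds, and the
function name time_select_loop is made up for illustration. It calls
select() with no file descriptors, so the call is a pure sleep, and
returns the total elapsed wall-clock time in milliseconds.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Call select() `count` times with a timeout of `usec` microseconds each
 * (the unit is an assumption) and no file descriptors to watch.
 * Returns elapsed wall-clock milliseconds, or -1.0 on bad input or error.
 */
static double time_select_loop(long usec, int count)
{
    struct timeval start, end, delay;
    int i;

    if (usec < 0 || count < 0)
        return -1.0;
    if (gettimeofday(&start, NULL) != 0)
        return -1.0;

    for (i = 0; i < count; i++) {
        /* Reset every time: some systems modify the timeout in place. */
        delay.tv_sec = usec / 1000000;
        delay.tv_usec = usec % 1000000;
        if (select(0, NULL, NULL, NULL, &delay) < 0)
            return -1.0;        /* e.g. interrupted by a signal */
    }

    if (gettimeofday(&end, NULL) != 0)
        return -1.0;
    return (end.tv_sec - start.tv_sec) * 1000.0 +
           (end.tv_usec - start.tv_usec) / 1000.0;
}
```

Calling time_select_loop(1, 1000) from main() and dividing the result by
1000 gives the average time per select(). A system that rounds every
sleep up to a 10ms clock tick should report about 10ms per call, as the
BSD/OS numbers do. A system with finer timing should report less.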

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026

Attachment Content-Type Size
unknown_filename text/plain 360 bytes
