
Re: LWLock statistics collector (was: CSStorm occurred again by postgreSQL8.2)

From: Katsuhiko Okano <okano(dot)katsuhiko(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: LWLock statistics collector (was: CSStorm occurred again by postgreSQL8.2)
Date: 2006-07-31 08:52:31
Message-ID: 200607311752.DFD60964.TuLUILPPBBPLVJO@oss.ntt.co.jp
Lists: pgsql-hackers, pgsql-patches
Hi all,

The cause of the CSStorm problem discussed in my previous mails has been found,
and a provisional patch resolves it, so here is a report.

> Subject: [HACKERS] poor performance with Context Switch Storm at TPC-W.
> Date: Tue, 11 Jul 2006 20:09:24 +0900
> From: Katsuhiko Okano <okano(dot)katsuhiko(at)oss(dot)ntt(dot)co(dot)jp>
>
> poor performance with Context Switch Storm occurred
> with the following composition.


Background:
SAVEPOINT has been supported since PostgreSQL 8.0,
so any transaction may contain one or more subtransactions.
When judging the visibility of a tuple, we must determine whether the XID
that inserted the tuple belongs to a top-level transaction or a subtransaction
(if its XMIN is committed).
Making that determination requires access to SubTrans
(the data structure that records the parent of each transaction ID).
SubTrans is accessed via an LRU buffer.
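
As a rough illustration of the lookup described above, here is a minimal C
sketch that walks parent links until it reaches a top-level transaction.
The array parent_of, the constant MAX_XID and both function names are
hypothetical stand-ins for illustration, not PostgreSQL's actual SubTrans code.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId ((TransactionId) 0)
#define MAX_XID 16              /* toy table size (assumption) */

/*
 * Toy stand-in for SubTrans: parent_of[xid] holds the parent XID, or
 * InvalidTransactionId when xid is a top-level transaction.
 */
static TransactionId parent_of[MAX_XID];

/* Record that "child" is a subtransaction of "parent". */
void set_parent(TransactionId child, TransactionId parent)
{
    assert(child < MAX_XID && parent < MAX_XID);
    parent_of[child] = parent;
}

/* Follow parent links until a top-level transaction is reached. */
TransactionId topmost_transaction(TransactionId xid)
{
    assert(xid < MAX_XID);
    while (parent_of[xid] != InvalidTransactionId)
        xid = parent_of[xid];
    return xid;
}
```

In the real system each step of this walk may have to go through the shared
LRU buffer, which is why the lookup frequency matters.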


Conditions under which the phenomenon occurs:
- There are transactions that read a large number of tuples at high frequency (typically seq scans).
- There are updating transactions (at a suitable frequency).
- There is a long-lived transaction (of a suitable length).


Points of view:
(A) The buffer replacement algorithm is bad.
The timestamp of a page being swapped out is not refreshed until
the swapout completes.
If other pages are accessed during the swap, the page being swapped out
ends up as the least recently used page,
and it is chosen again as the page to swap out
(when the workload is high).

(B) SubTrans is accessed too often: once for every tuple visibility check.
If many processes wait on an LWLock using semop, a CSStorm occurs.


Results:
For (A),
I created a patch so that a page whose read or write is IN PROGRESS is not
considered a replacement candidate.
(It keeps a "betterslot" in case all the pages are
IN PROGRESS.)
I applied the patch.
However, the problem recurred; it was not a fundamental solution.
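
The idea behind the patch for (A) can be sketched roughly as follows. This is
not the patch itself: the types PageStatus and Slot, the field lru_count and
the function select_victim are hypothetical, and I am assuming that
"betterslot" serves as the fallback when every page has I/O in progress.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    PAGE_EMPTY,
    PAGE_READ_IN_PROGRESS,
    PAGE_CLEAN,
    PAGE_DIRTY,
    PAGE_WRITE_IN_PROGRESS
} PageStatus;

typedef struct {
    PageStatus status;
    unsigned   lru_count;       /* larger value = more recently used */
} Slot;

static int io_in_progress(PageStatus s)
{
    return s == PAGE_READ_IN_PROGRESS || s == PAGE_WRITE_IN_PROGRESS;
}

/*
 * Pick the slot to replace.  Slots with I/O in progress are skipped, so a
 * page that is currently being swapped out is not chosen a second time.
 */
int select_victim(const Slot *slots, size_t nslots)
{
    int bestslot = -1;          /* LRU slot with no I/O in progress */
    int betterslot = -1;        /* LRU slot overall, used as fallback */

    assert(slots != NULL && nslots > 0);

    for (size_t i = 0; i < nslots; i++) {
        if (slots[i].status == PAGE_EMPTY)
            return (int) i;     /* a free slot needs no eviction */

        if (betterslot < 0 || slots[i].lru_count < slots[betterslot].lru_count)
            betterslot = (int) i;

        if (io_in_progress(slots[i].status))
            continue;           /* not a replacement candidate */

        if (bestslot < 0 || slots[i].lru_count < slots[bestslot].lru_count)
            bestslot = (int) i;
    }

    /* Fall back only when every slot has I/O in progress. */
    return bestslot >= 0 ? bestslot : betterslot;
}
```

With the fallback, the caller would still have to wait for the I/O on the
returned slot to finish before reusing it.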

For (B),
a patch was created that treats all
transactions as top-level transactions.
(Thank you, ITAGAKI.) I applied the patch and ran the measurement for 8 hours.
The CSStorm problem no longer occurred.


Discussion:
(1) Since this workload uses neither SAVEPOINT nor error trapping in PL/pgSQL,
subtransactions are unnecessary.
Would it be better to implement a mode that does not use subtransactions?

(2) It would be better if lookups could be cached with a structure
like CLOG's, so that it is not necessary to check
the LRU buffer every time.
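
One possible shape for idea (2), shown only as a sketch: a per-backend
one-entry cache in front of the shared lookup. All names here (parent_table,
slow_get_parent, cached_get_parent, slow_lookup_count) are hypothetical, and
the sketch assumes that an entry does not change after it has first been read.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId ((TransactionId) 0)
#define MAX_XID 16                  /* toy table size (assumption) */

static TransactionId parent_table[MAX_XID]; /* stands in for the shared buffer */
static unsigned slow_lookups;               /* counts trips to the shared buffer */

/* One-entry local cache: the last XID looked up and its parent. */
static TransactionId cached_xid = InvalidTransactionId;
static TransactionId cached_parent = InvalidTransactionId;

void set_parent_entry(TransactionId child, TransactionId parent)
{
    assert(child < MAX_XID && parent < MAX_XID);
    parent_table[child] = parent;
}

/* The expensive path: in the real system this would take an LWLock. */
TransactionId slow_get_parent(TransactionId xid)
{
    assert(xid < MAX_XID);
    slow_lookups++;
    return parent_table[xid];
}

/* Answer repeated lookups of the same XID without the expensive path. */
TransactionId cached_get_parent(TransactionId xid)
{
    if (xid != InvalidTransactionId && xid == cached_xid)
        return cached_parent;

    TransactionId parent = slow_get_parent(xid);
    cached_xid = xid;
    cached_parent = parent;
    return parent;
}

unsigned slow_lookup_count(void)
{
    return slow_lookups;
}
```

A seq scan over many tuples inserted by the same XID would then reach the
shared buffer once rather than once per tuple.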


Are there any problems with this, or other ideas?
--------
Katsuhiko Okano
okano katsuhiko _at_ oss ntt co jp
