Re: LWLock statistics collector (was: CSStorm occurred again by postgreSQL8.2)

From: Katsuhiko Okano <okano(dot)katsuhiko(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: LWLock statistics collector (was: CSStorm occurred again by postgreSQL8.2)
Date: 2006-07-31 09:00:17
Message-ID: 200607311800.BHI65621.LBUBLIuVOPPJLTP@oss.ntt.co.jp
Lists: pgsql-hackers pgsql-patches

Katsuhiko Okano wrote:
> The cause of the CSStorm problem described in my previous mails has been
> found, and a provisional patch resolves it, so I am reporting the result.
(snip)
> (A) The buffer replacement algorithm is bad.
> The time stamp of a page being swapped out is not refreshed until
> the swapout completes.
> If other pages are accessed while the swapout is in progress, the page
> being swapped out becomes the least recently used one,
> so it is chosen again as the page to swap out.
> (This happens when the workload is high.)
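
To make point (A) concrete, here is a minimal sketch of the effect. It is a simplified model, not the slru.c code: the pool size, the type names, and the helpers touch(), select_victim_unpatched() and count_repeat_picks() are all hypothetical. The only behaviour taken from the description above is that a slot's LRU count is not reset while its write is in progress.

```c
#include <assert.h>

#define NUM_SLOTS 4   /* hypothetical pool size, not NUM_SLRU_BUFFERS */

typedef enum {
    PAGE_EMPTY,
    PAGE_CLEAN,
    PAGE_DIRTY,
    PAGE_WRITE_IN_PROGRESS
} PageStatus;

typedef struct {
    PageStatus   status[NUM_SLOTS];
    unsigned int lru_count[NUM_SLOTS];  /* larger = less recently used */
} Pool;

/* Touch a slot: it becomes most recently used, every other slot ages. */
static void touch(Pool *p, int slot)
{
    assert(slot >= 0 && slot < NUM_SLOTS);
    for (int i = 0; i < NUM_SLOTS; i++)
        p->lru_count[i]++;
    p->lru_count[slot] = 0;
}

/* Unpatched-style choice: the oldest slot wins, whatever its I/O state. */
static int select_victim_unpatched(const Pool *p)
{
    int best = 0;
    unsigned int bestcount = 0;

    for (int i = 0; i < NUM_SLOTS; i++) {
        if (p->status[i] == PAGE_EMPTY)
            return i;
        if (p->lru_count[i] > bestcount) {
            best = i;
            bestcount = p->lru_count[i];
        }
    }
    return best;
}

/* Count how many of `rounds` later selections pick the slot whose
 * write is still in progress. */
static int count_repeat_picks(int rounds)
{
    Pool p = { {PAGE_DIRTY, PAGE_DIRTY, PAGE_DIRTY, PAGE_DIRTY},
               {3, 2, 1, 0} };
    int victim = select_victim_unpatched(&p);
    int repeats = 0;

    /* The write starts, but the victim's LRU count is left untouched. */
    p.status[victim] = PAGE_WRITE_IN_PROGRESS;

    for (int r = 0; r < rounds; r++) {
        touch(&p, 1 + r % (NUM_SLOTS - 1));   /* other pages are accessed */
        if (select_victim_unpatched(&p) == victim)
            repeats++;
    }
    return repeats;
}
```

In this model the slot under write only ever ages, so every later selection lands on it again until the write completes, which is the behaviour described in (A).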

The following is the patch.

diff -cpr postgresql-8.1.4-orig/src/backend/access/transam/slru.c postgresql-8.1.4-SlruSelectLRUPage-fix/src/backend/access/transam/slru.c
*** postgresql-8.1.4-orig/src/backend/access/transam/slru.c 2006-01-21 13:38:27.000000000 +0900
--- postgresql-8.1.4-SlruSelectLRUPage-fix/src/backend/access/transam/slru.c 2006-07-25 18:02:49.000000000 +0900
*************** SlruSelectLRUPage(SlruCtl ctl, int pagen
*** 703,710 ****
  	for (;;)
  	{
  		int slotno;
! 		int bestslot = 0;
  		unsigned int bestcount = 0;
  
  		/* See if page already has a buffer assigned */
  		for (slotno = 0; slotno < NUM_SLRU_BUFFERS; slotno++)
--- 703,712 ----
  	for (;;)
  	{
  		int slotno;
! 		int bestslot = -1;
! 		int betterslot = -1;
  		unsigned int bestcount = 0;
+ 		unsigned int bettercount = 0;
  
  		/* See if page already has a buffer assigned */
  		for (slotno = 0; slotno < NUM_SLRU_BUFFERS; slotno++)
*************** SlruSelectLRUPage(SlruCtl ctl, int pagen
*** 720,732 ****
  		 */
  		for (slotno = 0; slotno < NUM_SLRU_BUFFERS; slotno++)
  		{
! 			if (shared->page_status[slotno] == SLRU_PAGE_EMPTY)
! 				return slotno;
! 			if (shared->page_lru_count[slotno] > bestcount &&
! 				shared->page_number[slotno] != shared->latest_page_number)
! 			{
! 				bestslot = slotno;
! 				bestcount = shared->page_lru_count[slotno];
  			}
  		}
  
--- 722,746 ----
  		 */
  		for (slotno = 0; slotno < NUM_SLRU_BUFFERS; slotno++)
  		{
! 			switch (shared->page_status[slotno])
! 			{
! 				case SLRU_PAGE_EMPTY:
! 					return slotno;
! 				case SLRU_PAGE_READ_IN_PROGRESS:
! 				case SLRU_PAGE_WRITE_IN_PROGRESS:
! 					if (shared->page_lru_count[slotno] > bettercount &&
! 						shared->page_number[slotno] != shared->latest_page_number)
! 					{
! 						betterslot = slotno;
! 						bettercount = shared->page_lru_count[slotno];
! 					}
! 				default: /* SLRU_PAGE_CLEAN,SLRU_PAGE_DIRTY */
! 					if (shared->page_lru_count[slotno] > bestcount &&
! 						shared->page_number[slotno] != shared->latest_page_number)
! 					{
! 						bestslot = slotno;
! 						bestcount = shared->page_lru_count[slotno];
! 					}
  			}
  		}
  
*************** SlruSelectLRUPage(SlruCtl ctl, int pagen
*** 736,741 ****
--- 750,758 ----
  		if (shared->page_status[bestslot] == SLRU_PAGE_CLEAN)
  			return bestslot;
  
+ 		if (bestslot == -1)
+ 			bestslot = betterslot;
+ 
  		/*
  		 * We need to do I/O. Normal case is that we have to write it out,
  		 * but it's possible in the worst case to have selected a read-busy
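
The selection rule the patch is aiming at can be sketched on its own as follows. This is a standalone illustration under stated assumptions, not the patched slru.c: the Pool struct, the PAGE_* names and select_victim() are hypothetical. Two details differ from the hunks as posted and are assumptions about the intent. First, the posted switch has no break after the READ_IN_PROGRESS/WRITE_IN_PROGRESS cases, so control appears to fall through into the default case; the sketch adds the break so that busy slots only become the fallback. Second, the posted code reads page_status[bestslot] before the bestslot == -1 check; the sketch resolves the fallback first.

```c
#include <assert.h>

#define NUM_SLOTS 4   /* hypothetical pool size, not NUM_SLRU_BUFFERS */

typedef enum {
    PAGE_EMPTY,
    PAGE_READ_IN_PROGRESS,
    PAGE_CLEAN,
    PAGE_DIRTY,
    PAGE_WRITE_IN_PROGRESS
} PageStatus;

typedef struct {
    PageStatus   status[NUM_SLOTS];
    unsigned int lru_count[NUM_SLOTS];   /* larger = less recently used */
    int          page_number[NUM_SLOTS];
    int          latest_page_number;     /* never chosen as a victim */
} Pool;

/*
 * Prefer the oldest slot with no I/O in progress ("best"); fall back to
 * the oldest I/O-busy slot ("better") only when no idle slot qualifies.
 * Returns -1 if nothing qualifies; like the posted patch, a slot whose
 * count is 0 is never selected because the comparison is strict.
 */
static int select_victim(const Pool *p)
{
    int bestslot = -1;
    int betterslot = -1;
    unsigned int bestcount = 0;
    unsigned int bettercount = 0;

    for (int i = 0; i < NUM_SLOTS; i++) {
        int is_latest = (p->page_number[i] == p->latest_page_number);

        switch (p->status[i]) {
        case PAGE_EMPTY:
            return i;
        case PAGE_READ_IN_PROGRESS:
        case PAGE_WRITE_IN_PROGRESS:
            if (!is_latest && p->lru_count[i] > bettercount) {
                betterslot = i;
                bettercount = p->lru_count[i];
            }
            break;   /* assumption: busy slots stay out of the primary choice */
        default:     /* PAGE_CLEAN, PAGE_DIRTY */
            if (!is_latest && p->lru_count[i] > bestcount) {
                bestslot = i;
                bestcount = p->lru_count[i];
            }
            break;
        }
    }

    assert(bestslot < NUM_SLOTS && betterslot < NUM_SLOTS);
    return (bestslot != -1) ? bestslot : betterslot;
}
```

With statuses {DIRTY, WRITE_IN_PROGRESS, CLEAN, DIRTY}, counts {1, 9, 3, 2} and the last slot holding the latest page, this picks slot 2 even though slot 1 has the largest count, because slot 1 is still being written.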

Regards,
--------
Katsuhiko Okano
okano katsuhiko _at_ oss ntt co jp
