Re: measuring lwlock-related latency spikes

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: measuring lwlock-related latency spikes
Date: 2012-04-02 07:33:47
Message-ID: CA+U5nMJaDfFRHmYew0iMVQ-PU5LBts95EbV=dNX+TY9rdQjO4Q@mail.gmail.com
Lists: pgsql-hackers

On Mon, Apr 2, 2012 at 12:00 AM, Greg Stark <stark(at)mit(dot)edu> wrote:
> On Sun, Apr 1, 2012 at 4:05 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> My guess based on previous testing is
>> that what's happening here is (1) we examine a tuple on an old page
>> and decide we must look up its XID, (2) the relevant CLOG page isn't
>> in cache so we decide to read it, but (3) the page we decide to evict
>> happens to be dirty, so we have to write it first.
>
> Reading the code one possibility is that in the time we write the
> oldest slru page another process has come along and redirtied it. So
> we pick a new oldest slru page and write that. By the time we've
> written it another process could have redirtied it again. On a loaded
> system where the writes are taking 100ms or more it's conceivable --
> barely -- that could happen over and over again hundreds of times.
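
To make sure we're picturing the same loop, here is roughly the
pattern in question (a simplified sketch only, not the real slru.c
code; the helper names are invented):

    /*
     * Simplified sketch of the eviction loop being described; not the
     * real slru.c code, and all helper names are invented.
     */
    static int
    pick_clean_victim_slot(void)
    {
        for (;;)
        {
            int     slotno = oldest_slru_slot();    /* hypothetical LRU choice */

            if (!slot_is_dirty(slotno))
                return slotno;          /* clean page, safe to evict */

            /*
             * The dirty page has to be written first, and the write drops
             * the control lock around the I/O.  With writes taking 100ms
             * or more on a loaded box, another backend has plenty of time
             * to re-dirty this page or dirty a different "oldest" page,
             * so the next pass can pick a new victim and start over.
             */
            write_slru_page(slotno);    /* hypothetical */
        }
    }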

That's a valid concern, but I don't think the instrumentation would
show that as a single long wait, because the locks would be released
and retaken on each trip around the loop. I guess it's for Robert to
explain how it would show up.

If it doesn't show up that way, then the actual max wait time could be
even higher. ;-(
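
For what it's worth, the per-acquisition timing I'm assuming the
instrumentation does is something like the sketch below
(record_lwlock_wait() is made up; the instr_time macros and
LWLockAcquire() are the usual ones). On that assumption, N trips
around the loop get logged as N separate waits rather than one long
one:

        instr_time  start,
                    duration;

        INSTR_TIME_SET_CURRENT(start);
        LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
        INSTR_TIME_SET_CURRENT(duration);
        INSTR_TIME_SUBTRACT(duration, start);

        /* one entry per blocked acquisition, not one for the whole loop */
        record_lwlock_wait(CLogControlLock, INSTR_TIME_GET_MICROSEC(duration));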

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
