From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Greg Stark <stark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: measuring lwlock-related latency spikes |
Date: | 2012-03-31 08:53:51 |
Message-ID: | CA+U5nMJBsGKKo6VmKqKy4gzHkU-XcGs6n_vt7t094xJCucLHZA@mail.gmail.com |
Lists: | pgsql-hackers |
On Sat, Mar 31, 2012 at 4:41 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> which means, if I'm not
> confused here, that every single lwlock-related stall > 1s happened
> while waiting for a buffer content lock. Moreover, each event
> affected a different buffer. I find this result so surprising that I
> have a hard time believing that I haven't screwed something up, so if
> anybody can check over the patch and this analysis and suggest what
> that thing might be, I would appreciate it.
Possible candidates are:
1) Pages on the RHS of the PK index on accounts. When the page splits,
a new buffer will be allocated and the contention will move to the new
buffer. Given so few stalls, I'd say this was the block one level above
the leaf level.
2) Buffer writes hold the content lock in shared mode, so a delayed
I/O during checkpoint on a page requested by another backend for write
would show up as a wait for a content lock. That might happen to updates
where the checkpoint write occurs between the search and write portions
of the update.
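
As a rough illustration of candidate 2, the stall can be modelled with
plain arithmetic: a writer holds the content lock in shared mode for as
long as its I/O takes, and an exclusive requester arriving in that
window waits until the hold is released. This is a simplified sketch,
not PostgreSQL code; `SharedHold` and `exclusive_wait` are hypothetical
names, and the model ignores lock queueing and any shared holds that
begin after the request.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SharedHold:
    """Hypothetical model of a shared hold on a buffer content lock,
    e.g. a checkpoint write that keeps the lock until its I/O completes."""
    start: float    # time the shared lock was taken (seconds)
    io_time: float  # duration of the write I/O while the lock is held

    @property
    def release(self) -> float:
        return self.start + self.io_time


def exclusive_wait(request_time: float, holds: List[SharedHold]) -> float:
    """Seconds an exclusive requester waits: until every shared hold
    that is active at request_time has been released."""
    active = [h for h in holds if h.start <= request_time < h.release]
    if not active:
        return 0.0
    return max(h.release for h in active) - request_time
```

With a 2-second delayed write starting at t=0, an update that asks for
the exclusive lock at t=0.5 waits 1.5 seconds, while one arriving after
the write has finished does not wait at all.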
The next logical step in measuring lock waits is to track the reason
for the lock wait, not just the lock wait itself.
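
One way to picture that next step is a wait recorder keyed by both the
lock and a caller-supplied reason. The sketch below is hypothetical and
is not taken from the patch under discussion: `WaitStats`, `waiting`,
`stalls`, and the reason strings are all made-up names, and a real
implementation would live in the C lwlock code rather than in Python.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class WaitStats:
    """Hypothetical recorder that tags each lock wait with a reason."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock              # injectable so tests can fake time
        self.total = defaultdict(float)  # (lock_id, reason) -> summed wait
        self.count = defaultdict(int)    # (lock_id, reason) -> number of waits
        self.worst = {}                  # (lock_id, reason) -> longest wait

    @contextmanager
    def waiting(self, lock_id, reason):
        """Wrap the blocking part of a lock acquisition."""
        start = self._clock()
        try:
            yield
        finally:
            elapsed = self._clock() - start
            key = (lock_id, reason)
            self.total[key] += elapsed
            self.count[key] += 1
            self.worst[key] = max(self.worst.get(key, 0.0), elapsed)

    def stalls(self, threshold):
        """Return the (lock_id, reason) pairs whose longest wait
        exceeded threshold seconds, in sorted order."""
        return sorted(k for k, w in self.worst.items() if w > threshold)
```

A caller would wrap the blocking acquire, for example
`with stats.waiting("buffer 42", "checkpoint write"): ...`, and then
inspect `stats.stalls(1.0)` to see which reasons account for the
stalls longer than one second.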
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services