Re: Vacuum rate limit in KBps

From: Jim Nasby <jim(at)nasby(dot)net>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Vacuum rate limit in KBps
Date: 2012-01-21 22:54:07
Message-ID: 3CCA38FD-46EC-4B54-97CE-26A1CCBC2AA3@nasby.net
Lists: pgsql-hackers

On Jan 19, 2012, at 4:23 PM, Greg Smith wrote:
> On 1/18/12 4:18 PM, Jim Nasby wrote:
>> What about doing away with all the arbitrary numbers completely, and just state data rate limits for hit/miss/dirty?
>
> Since many workloads will have a mix of all three, it still seems like there's some need for weighing these individually, even if they each got their own rates. If someone says read=8MB/s and write=4MB/s (the current effective defaults), I doubt they would be happy with seeing 12MB/s happen.
>
>> BTW, this is a case where it would be damn handy to know if the miss was really a miss or not... in the case where we're already rate limiting vacuum, could we afford the cost of gettimeofday() to see if a miss actually did have to come from disk?
>
> We certainly might if it's a system where timing information is reasonably cheap, and measuring that exact area will be easy if the timing test contrib module submitted into this CF gets committed. I could see using that to re-classify some misses as hits if the read returns fast enough.
>
> There's not an obvious way to draw that line though. The "fast=hit" vs. "slow=miss" transition happens at a very different place on SSD vs. regular disks, as the simplest example. I don't see any way to wander down this path that doesn't end up introducing multiple new GUCs, which is the opposite of what I'd hoped to do--which was at worst to keep the same number, but reduce how many were likely to be touched.
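The timing-based reclassification discussed above could be sketched roughly as follows (Python for brevity; `classify_read` and `FAST_READ_THRESHOLD_US` are hypothetical names, and the 50µs cutoff is purely an assumption which, as Greg notes, would sit at a very different place on SSDs than on spinning disks):

```python
import time

# Assumed cutoff separating "served from cache" from "went to disk".
# The right value differs greatly between SSDs and regular disks,
# so treat this as a placeholder, not a recommendation.
FAST_READ_THRESHOLD_US = 50

def classify_read(read_fn):
    """Time a buffer read; if it returns fast enough, reclassify the
    nominal miss as a hit, on the theory that the OS cache served it."""
    start = time.monotonic()
    data = read_fn()
    elapsed_us = (time.monotonic() - start) * 1e6
    return data, ("hit" if elapsed_us < FAST_READ_THRESHOLD_US else "miss")
```

The cost being weighed here is exactly the one raised above: one extra pair of clock reads per miss, which is only cheap if timer reads on the platform are cheap.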

Your two comments together made me realize something... at the end of the day people don't care about MB/s. They care about the impact on other read and write activity in the database.

What would be interesting is if we could monitor how long all *foreground* IO requests took. If they start exceeding some number, that means the system is at or near full capacity, and we'd like background stuff to slow down.
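A minimal sketch of that feedback idea (every name and constant here is hypothetical, not anything that exists in PostgreSQL): track a moving average of foreground request latency, and back the background work off whenever it climbs past a target:

```python
class BackgroundThrottle:
    """Sketch: slow background IO when foreground request latency
    exceeds a target, and speed it back up when there is headroom.
    All constants are illustrative assumptions."""

    def __init__(self, target_ms=5.0, min_delay_ms=1.0, max_delay_ms=100.0):
        self.target_ms = target_ms
        self.min_delay_ms = min_delay_ms
        self.max_delay_ms = max_delay_ms
        self.delay_ms = min_delay_ms      # current inter-batch sleep
        self.avg_latency_ms = 0.0

    def observe_foreground(self, latency_ms, alpha=0.2):
        # Exponentially weighted moving average of foreground latency.
        self.avg_latency_ms = (1 - alpha) * self.avg_latency_ms + alpha * latency_ms
        if self.avg_latency_ms > self.target_ms:
            # System looks saturated: back off multiplicatively.
            self.delay_ms = min(self.delay_ms * 2, self.max_delay_ms)
        else:
            # Headroom available: let background work speed back up.
            self.delay_ms = max(self.delay_ms / 2, self.min_delay_ms)
```

Multiplicative backoff reacts quickly when foreground latency spikes, which matches the goal stated above: the background stuff is the first thing to slow down when the system nears full capacity.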

Dealing with SSDs vs. real media would be a bit challenging, though I think it would only be an issue if the two were randomly mixed together. Kept separately, I would expect them to have distinct behavior patterns that could be measured and identified.
--
Jim C. Nasby, Database Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net
