Re: OS cached buffers (was: Support Parallel Query Execution

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Luke Lonergan <llonergan(at)greenplum(dot)com>, Skype Technologies OY <hannu(at)skype(dot)net>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Myron Scott <lister(at)sacadia(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: OS cached buffers (was: Support Parallel Query Execution
Date: 2006-04-14 04:45:05
Message-ID: 20060414044505.GW49405@pervasive.com
Lists: pgsql-hackers pgsql-patches

On Thu, Apr 13, 2006 at 03:38:04PM -0400, Bruce Momjian wrote:
> > Is there any practical way to tell the difference between a page coming
> > from the OS cache and one coming from disk? Or maybe for a set of pages
> > an estimate on how many came from cache vs disk? There are some areas
> > where having this information would be very useful, such as for vacuum
> > delay. It would make tuning much easier, and it would also give us some
> > insight on how heavily loaded disks were, which would also be useful
> > info for vacuum to have (so we could adjust vacuum_cost_delay
> > dynamically based on load).
>
> getrusage() returns:
>
> ! 0.000062 elapsed 0.000000 user 0.000062 system sec
> ! [0.000000 user 0.009859 sys total]
> ! 0/0 [19/2] filesystem blocks in/out
> ! 0/0 [0/0] page faults/reclaims, 0 [0] swaps
> ! 0 [0] signals rcvd, 0/0 [4/5] messages rcvd/sent
> ! 0/0 [23/6] voluntary/involuntary context switches
>
> but I don't see anything in there that would show kernel cache vs. disk
> I/O. In fact, there is usually little connection in the kernel between
> an I/O request and the process that requests it.

Yeah, my assumption has been that the only way to tell the difference
would be by timing, but I don't know how practical that is. Since
gettimeofday(), or whatever EXPLAIN ANALYZE uses, is apparently very
expensive, perhaps there's some other alternative. Perhaps the timing
info in getrusage() would work for this. Another idea is setting an alarm
for a fairly short period before making the IO request. If the request
comes back before the alarm fires, the data was most likely in the OS
cache.
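
To make the timing idea concrete, here is a rough sketch in Python (not backend code). It times a single block read and guesses "cached" when the read finishes under a cutoff. The names timed_read and probably_cached and the 200 microsecond cutoff are made up for illustration; a real cutoff would have to be calibrated per system, and the sketch does nothing about the cost of reading the clock itself.

```python
import os
import tempfile
import time

# Hypothetical cutoff: reads faster than this are guessed to be cache hits.
# The right value depends on the hardware and would need calibrating.
CACHE_THRESHOLD_SEC = 0.0002


def timed_read(fd, offset, size=8192):
    """Read one block and return (data, elapsed_seconds)."""
    start = time.perf_counter()
    data = os.pread(fd, size, offset)
    return data, time.perf_counter() - start


def probably_cached(elapsed, threshold=CACHE_THRESHOLD_SEC):
    """Guess from latency alone whether a read came from the OS cache."""
    return elapsed < threshold


if __name__ == "__main__":
    # Demo on a scratch file; freshly written data is normally still cached.
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"x" * 8192)
        f.flush()
        data, elapsed = timed_read(f.fileno(), 0)
        print(len(data), elapsed, probably_cached(elapsed))
```

This is only a heuristic: a slow read might just mean the process was descheduled, and a fast one might come from a drive or controller cache rather than the OS.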

Another thought is that any IO request that goes to disk would most
likely put the process requesting the IO to sleep, but a request being
served out of cache might not do that. Perhaps there's some way to
recognize that.
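
One possible way to recognize it, sketched in Python: compare the voluntary context switch counter from getrusage() before and after the read. The helper name read_with_switch_count is hypothetical, and the whole thing rests on the assumption that a read which blocks on disk shows up as a voluntary context switch. The counter is per-process and not every platform maintains it, so treat the result as a hint at best.

```python
import os
import resource
import tempfile


def read_with_switch_count(fd, offset, size=8192):
    """Read one block; return (data, voluntary context switches during the read)."""
    before = resource.getrusage(resource.RUSAGE_SELF).ru_nvcsw
    data = os.pread(fd, size, offset)
    after = resource.getrusage(resource.RUSAGE_SELF).ru_nvcsw
    # Assumption: a nonzero delta suggests the process slept waiting for IO.
    return data, after - before


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"x" * 8192)
        f.flush()
        data, switches = read_with_switch_count(f.fileno(), 0)
        print(len(data), switches)
```

The two extra getrusage() calls per read are system calls themselves, so this would likely only make sense on a sampled subset of reads.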

Or maybe a better track would be to develop a patch for as many OSes as
possible that would tell the caller if an IO request came out of cache
or not.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
