Re: EXPLAIN ANALYZE printing logical and hardware I/O per-node

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Decibel!" <decibel(at)decibel(dot)org>
Cc: "Neil Conway" <neilc(at)samurai(dot)com>, "pgsql-hackers list" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: EXPLAIN ANALYZE printing logical and hardware I/O per-node
Date: 2007-12-19 02:08:38
Message-ID: 877ijbjv15.fsf@oxford.xeocode.com
Lists: pgsql-hackers


"Decibel!" <decibel(at)decibel(dot)org> writes:

> When a read() call returns, surely the kernel knows whether it actually issued
> a physical read request to satisfy that. I don't see any reason why you
> couldn't have a version of read() that returns that information. I also rather
> doubt that we're the only userland software that would make use of that.

I'm told this exists on Windows for the async interface. But AFAIK it doesn't
on Unix. The visibility into things like this is what makes DTrace so
remarkable.

I don't think much userland software is interested in this. The only two
cases I can think of are databases -- which use direct I/O partly because of
this issue -- and real-time software like multimedia software -- which
isn't so much interested in measuring it as in forcing things to be preloaded
with stuff like posix_fadvise() or mlock().
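
For reference, the preloading approach mentioned above might look roughly like
the following on a POSIX system. This is only a sketch: the helper names
hint_willneed() and pin_in_memory() are made up for illustration, and only
posix_fadvise() and mlock() themselves are real calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

/* Hypothetical helper: ask the kernel to start reading a file range into
 * the page cache ahead of use. A len of 0 means "through end of file".
 * Returns 0 or an error number; posix_fadvise() returns the error
 * directly rather than setting errno. */
int hint_willneed(int fd, off_t offset, off_t len)
{
    return posix_fadvise(fd, offset, len, POSIX_FADV_WILLNEED);
}

/* Hypothetical helper: pin a buffer in RAM so later accesses cannot
 * fault out to disk. Returns 0 on success, or -1 with errno set, for
 * example when the RLIMIT_MEMLOCK limit would be exceeded. */
int pin_in_memory(const void *buf, size_t len)
{
    assert(buf != NULL);
    return mlock(buf, len);
}
```

Neither call reports whether a later read() actually hit the disk; they only
try to make a physical read unnecessary, which is why this kind of software
has little use for the measurement itself.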

I don't think DTrace is overkill either. The programmatic interface is
undocumented (but I've gotten Sun people to admit it exists -- I just have to
reverse engineer it from the existing code samples) but should be more or less
exactly what we need.

But the lowest-common-denominator approach of just timing read() and seeing
if it took long enough to involve either a context switch or sleeping on
physical I/O should be a pretty close approximation. The case where it would
be least accurate is when most or all of the data is actually in the cache.
Then even a low false-positive rate in detecting cache misses will still
dominate the true near-zero rate of cache misses.
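
A minimal sketch of the timing approach, assuming a POSIX system with
clock_gettime(). The IoTimingStats struct, the function names, and above all
the 100 microsecond cutoff are assumptions for illustration, not anything
proposed in this thread; a real cutoff would have to be tuned per platform.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical per-node counters. */
typedef struct IoTimingStats {
    uint64_t fast;  /* reads that returned quickly (probably cached) */
    uint64_t slow;  /* reads slow enough to suggest a sleep or disk I/O */
} IoTimingStats;

/* Assumed cutoff of 100 microseconds; purely illustrative. */
#define SLOW_READ_THRESHOLD_NS 100000LL

/* Nanoseconds elapsed between two CLOCK_MONOTONIC samples. */
int64_t elapsed_ns(const struct timespec *start, const struct timespec *end)
{
    return (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL
         + (int64_t)(end->tv_nsec - start->tv_nsec);
}

/* Count one read as fast or slow based on how long it took. */
void classify_read(IoTimingStats *stats, int64_t ns)
{
    assert(stats != NULL);
    if (ns >= SLOW_READ_THRESHOLD_NS)
        stats->slow++;
    else
        stats->fast++;
}

/* Wrapper around read() that times the call and updates the counters.
 * Failed reads are returned unchanged and not counted. */
ssize_t timed_read(int fd, void *buf, size_t len, IoTimingStats *stats)
{
    struct timespec start, end;
    ssize_t n;

    clock_gettime(CLOCK_MONOTONIC, &start);
    n = read(fd, buf, len);
    clock_gettime(CLOCK_MONOTONIC, &end);

    if (n >= 0)
        classify_read(stats, elapsed_ns(&start, &end));
    return n;
}
```

To make the false-positive concern concrete with made-up numbers: over a
million reads of which only 100 truly miss the cache, a 1% rate of cached
reads being misjudged as slow would add about 10,000 spurious "slow" counts,
swamping the 100 real ones.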

We could mitigate that somewhat by describing it in the plan as something like

................... (... I/O fast=nnn slow=nnn)

instead of the more descriptive "physical" and "logical".
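
As a rough illustration of producing just the "I/O fast=nnn slow=nnn" part of
such a line, here is a small sketch. The struct and the function name
format_io_counts() are hypothetical, and the rest of the plan line is left
out because its layout isn't specified here.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-node counters for timed reads. */
typedef struct IoTimingStats {
    uint64_t fast;
    uint64_t slow;
} IoTimingStats;

/* Write "I/O fast=<n> slow=<n>" into buf. Returns what snprintf()
 * returns: the length the full string needs, excluding the NUL. */
int format_io_counts(char *buf, size_t buflen, const IoTimingStats *stats)
{
    assert(buf != NULL && stats != NULL);
    return snprintf(buf, buflen, "I/O fast=%" PRIu64 " slow=%" PRIu64,
                    stats->fast, stats->slow);
}
```

Labelling the counters by what was measured (how long the call took) rather
than by what is inferred (whether the disk was touched) keeps the output
honest about the approximation.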

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!
