Re: Patch: add timing of buffer I/O requests

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch: add timing of buffer I/O requests
Date: 2011-11-29 01:59:02
Message-ID: 4ED43C66.6020108@2ndQuadrant.com
Lists: pgsql-hackers

On 11/28/2011 05:51 AM, Robert Haas wrote:
> On Mon, Nov 28, 2011 at 2:54 AM, Greg Smith<greg(at)2ndquadrant(dot)com> wrote:
>> The real problem with this whole area is that we know there are
>> systems floating around where the amount of time taken to grab timestamps
>> like this is just terrible.
> Assuming the feature is off by default (and I can't imagine we'd
> consider anything else), I don't see why that should be cause for
> concern. If the instrumentation creates too much system load, then
> don't use it: simple as that.

It's not quite that simple though. Releasing a performance measurement
feature that itself can perform terribly under undocumented conditions
has a wider downside than that.

Consider that people aren't going to turn it on until they are already
overloaded. If that has the potential to completely tank performance,
we had better make sure that area is at least explored usefully first; the
minimum diligence should be to document that fact and make suggestions
for avoiding or testing it.
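
As a rough illustration of the kind of test such documentation could
suggest, here is a minimal sketch that estimates how long one clock read
takes on a given machine. It is not part of the patch; the function name
clock_overhead_ns and the default call count are made up for this example.
Because it is written in Python, interpreter overhead inflates the result,
so treat the number as a loose upper bound for comparing machines rather
than as what the server itself would pay. A C loop around the same clock
source would be closer to that.

```python
import time


def clock_overhead_ns(calls=200_000):
    """Return the average cost, in nanoseconds, of one clock read.

    Hypothetical helper for illustration only. The figure includes the
    cost of the Python loop, so it overstates the raw clock cost.
    """
    read = time.perf_counter_ns  # monotonic, high-resolution clock
    start = read()
    for _ in range(calls):
        read()  # the operation being timed
    elapsed = read() - start
    return elapsed / calls


if __name__ == "__main__":
    per_call = clock_overhead_ns()
    print(f"{per_call:.0f} ns per clock read (includes loop overhead)")
```

Running this on a few different systems and comparing the results gives a
quick hint about whether timestamp collection is cheap or painfully slow
on a particular box before anyone enables timing under load.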

Instrumentation that can itself become a performance problem is an
advocacy problem waiting to happen. As I write this I'm picturing such
an encounter resulting in an angry blog post, about how this proves
PostgreSQL isn't usable for serious systems because someone sees massive
overhead turning this on. Right now the primary exposure to this class
of issue is EXPLAIN ANALYZE. When I was working on my book, I went out
of my way to find a worst case for that[1], and that turned out to be a
query that went from 7.994ms to 69.837ms when instrumented. I've been
meaning to investigate what was up there since finding that one. The
fact that we already have one such problem exposed worries
me; I'd really prefer not to have two.

[1] (Dell Store 2 schema, query was "SELECT count(*) FROM customers;")
