Re: Obtaining resource usage statistics from execution? (v 9.1)

From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Karl Denninger" <karl(at)denninger(dot)net>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Obtaining resource usage statistics from execution? (v 9.1)
Date: 2012-03-16 15:08:25
Message-ID: 9a66ff3fb0978ac1dfc418e1aa8afd7c.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

first of all, which PostgreSQL version are you using, what platform is it
running on? What level of control do you have over the database (are you
just a user or can you modify the postgresql.conf file)?

On 16 Březen 2012, 15:31, Karl Denninger wrote:
> Hi folks;
>
> I am trying to continue profiling which in turn feeds query and index
> tuning changes for the AKCS-WWW forum software, and appear to have no
> good way to do what I need to do -- or I've missed something obvious.

Why do you need to do that? Have you checked log_duration /
log_min_duration_statement configuration options? What about auto_explain
and maybe pg_stat_statements?

The aggregated data (e.g. provided by pg_stat_statements or pgfounie) are
IMHO much more useful than having to deal with data collected for each
query separately.

> The application uses the libpq interface from "C" to talk to Postgres
> which contains all the back end data. Since this is a forum application
> it is very read-heavy (other than accounting and of course user posting
> functionality), and is template-driven. All of the table lookup
> functions that come from the display templates are compartmentalized in
> one function in the base code.
>
> What I want to be able to do is to determine the resource usage by
> Postgres for each of these calls.
>
> I can do this by adding a call into the function just before the "real"
> call to PQexec() that prepends "explain analyze" to the call, makes a
> preamble call to PQexec() then grabs the last tuple returned which is
> the total execution time (with some text), parse that and there is the
> total time anyway. But I see no way to get stats on I/O (e.g. Postgres
> buffer hits and misses, calls to the I/O operating system level APIs,
> etc.)
>
> But while I can get the numbers this way it comes at the expense of
> doubling the Postgres processing. There does not appear, however, to be
> any exposition of the processing time requirements for actual (as
> opposed to "modeled" via explain analyze) execution of queries -- at
> least not via the libpq interface.

Yup, that's the problem of EXPLAIN ANALYZE. IMHO it's a 'no go' in this
case I guess. Not only you have to run the query twice, but it may also
significantly influence the actual runtime due to gettimeofday overhead
etc.

You can use auto_explain to eliminate the need to run the query twice, but
the overhead may still be a significant drag, not reflecting the actual
performance (and thus not useful to perform reasonable profiling).

> Am I missing something here -- is there a way to get resource
> consumption from actual queries as they're run? What I'm doing right
> now is the above, with a configuration switch that has a minimum
> reportable execution time and then logging the returns that exceed that
> time, logging the queries that have the above-threshold runtimes for
> analysis and attempted optimization. This works but obviously is
> something one only does for profiling as it doubles database load and is
> undesirable in ordinary operation. What I'd like to be able to do is
> have the code track performance all the time and raise alerts when it
> sees "outliers" giving me a continually-improving set of targets for
> reduction of resource consumption (up until I reach the point where I
> don't seem to be able to make it any faster of course :-))

If all you want is outliers, then set log_min_duration_statement and use
pgfounie to process the logs. That's very simple and very effective way to
deal with them.

If you really need the resource consumption stats, you may write a simple
SRF that calls getrusage and returns the data as a row so that you'll be
able to do something like

select * from pg_rusage()

This seems like a neat idea, and writing an extension that should be
fairly simple. Still, it will be a Linux-only (because getrusage is) and
I'm not quite sure the collected data are very useful.

Tomas

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Karl Denninger 2012-03-16 15:38:32 Re: Obtaining resource usage statistics from execution? (v 9.1)
Previous Message Tom Lane 2012-03-16 14:53:45 Re: Obtaining resource usage statistics from execution? (v 9.1)