Re: Sampling profiler updated

From: Stefan Moeding <pgsql(at)moeding(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sampling profiler updated
Date: 2009-07-15 19:39:51
Message-ID: 87bpnln3aw.fsf@esprit.moeding.net
Lists: pgsql-hackers

Hi!

Thanks for your answer. Here is my general reasoning: I was thinking
about a way to use the profiler to determine the resource profile of
individual, possibly short-lived, business transactions. I would like to
move away from a purely technical focus (high CPU usage, high I/O rate,
too many disk sorts, ...) towards a more customer/business-centric
approach (loading the customer form takes too long).

My vision would be to get a profile for just one session and only for
the time it takes to load the form. Using the profile for the whole
database would hide the exact details when other users are doing other
things at the same time. And obviously you need to do it on the
production machine during business hours, or you would miss most of the
influencing factors.

The resource profile for the observed business transaction tells us
where the time is actually spent. Applying Amdahl's Law also tells us
what improvement we can expect from a given change. Let's assume that
30% of the time is CPU time and the business requests that the
transaction duration be cut in half. Without the profile you could only
guess that a CPU upgrade might help. With the profile you can show that
even doubling the CPU speed will only get you a 15% improvement.
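
For reference, the arithmetic behind that 15% figure (a worked example
using the numbers above; the 30% CPU share is just the assumed value
from the example):

    total duration      = 1.00 (normalized)
    CPU portion         = 0.30
    everything else     = 0.70
    CPU twice as fast   = 0.30 / 2 = 0.15
    new total duration  = 0.70 + 0.15 = 0.85

That is a 15% reduction, nowhere near the requested 50%.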

In my opinion the advantage is that it not only shows you the most
beneficial approach (the resource that contributes most to the total
time), but also lets you state the result in business-related terms
(improve online form X or batch job Y) together with the specific
improvement that can be expected.

Itagaki Takahiro writes:

> I think per-backend profiling is confusable because one backend might be
> reused by multiple jobs if you use connection pooling. If we need more
> detailed profiling, it should be grouped by query variations.

I see your point. I must admit I hadn't thought of connection pooling,
probably because I don't use it. But it is easier to build views with
aggregations on top of detailed data to hide the details than to try to
reconstruct details when only averages are available.
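
As a rough illustration only (the table and column names below are
hypothetical and not part of the proposed patch), an aggregating view
over per-sample detail rows might look like this:

    -- Hypothetical raw data: one row per profiler sample.
    -- CREATE TABLE profile_samples (
    --     backend_pid int,
    --     query       text,
    --     wait_class  text,         -- e.g. 'CPU', 'IO', 'lock'
    --     sample_time timestamptz
    -- );

    CREATE VIEW profile_by_query AS
    SELECT query,
           wait_class,
           count(*) AS samples       -- each sample = one sampling interval
    FROM profile_samples
    GROUP BY query, wait_class;

That way the pooled backends disappear behind the aggregation, while the
raw rows are still available if you need the per-session detail.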

> I didn't know that. What is the point of the works? Are there some
> knowledge we should learn from?

I tried to outline most of his message in the first paragraphs above.
His Method R, a response-time-based approach, seems to me like a good
improvement, as it is measurable in business terms, allows a prediction
before you change or buy something, and gives a deterministic and
repeatable way to tackle the root cause of the current performance
shortcoming.

--
Stefan
