Re: RFC: Timing Events

From: Satoshi Nagayasu <snaga(at)uptime(dot)jp>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: RFC: Timing Events
Date: 2012-11-04 09:28:11
Lists: pgsql-hackers

(2012/11/03 10:44), Josh Berkus wrote:
>> I don't see all that going into core without a much bigger push than I
>> think people will buy.  What people really want for all these is a
>> proper trending system, and that means graphs and dashboards and
>> bling--not a history table.
> Well, I'm particularly thinking for autoconfiguration.  For example, to
> set vacuum_freeze_min_age properly, you have to know the XID "burn rate"
> of the server, which is only available via history.  I really don't want
> to be depending on a graphical monitoring utility to find these things out.
>> This whole approach has the assumption that things are going to fall off
>> sometimes.  To expand on that theme for a second, right now I'm more
>> worried about the "99%" class of problems.  Neither pg_stat_statements
>> nor this idea are very good for tracking the rare rogue problem down.
>> They're both aimed to make things that happen a lot more statistically
>> likely to be seen, by giving an easier UI to glare at them frequently.
>> That's not ideal, but I suspect really fleshing the whole queue consumer
>> -> table idea needs to happen to do much better.
> I'm just concerned that for some types of incidents, it would be much
> more than 1% *of what you want to look at* which fall off.  For example,
> consider a server which does 95% reads at a very high rate, but has 2%
> of its writes chronically having lock waits.  That's something you want
> to solve, but it seems fairly probable that these relatively infrequent
> queries would have fallen off the bottom of pg_stat_statements.  Same
> thing with the relative handful of queries which do large on-disk sorts.
> The problem I'm worried about is that pg_stat_statements is designed to
> keep the most frequent queries, but sometimes the thing you really need
> to look at is not in the list of most frequent queries.
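
On the XID "burn rate" point quoted above, here is a minimal sketch of why
history is needed: the rate can only be derived from at least two samples
taken some time apart. It assumes the samples come from txid_current(),
whose epoch-extended value does not wrap around. The XidSample type and
both function names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class XidSample:
    """One history row: when it was taken and the txid_current() value seen."""
    epoch_seconds: float
    txid: int


def xid_burn_rate(earlier: XidSample, later: XidSample) -> float:
    """Return transaction IDs consumed per second between two samples."""
    elapsed = later.epoch_seconds - earlier.epoch_seconds
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    consumed = later.txid - earlier.txid
    if consumed < 0:
        raise ValueError("txid went backwards; samples are from different clusters?")
    return consumed / elapsed


def hours_to_consume(xid_count: int, rate_per_second: float) -> float:
    """Return how many hours it takes to consume xid_count XIDs at this rate."""
    if rate_per_second <= 0:
        raise ValueError("rate must be positive")
    return xid_count / rate_per_second / 3600.0


if __name__ == "__main__":
    # Two made-up samples one hour apart.
    first = XidSample(epoch_seconds=0.0, txid=1_000_000)
    second = XidSample(epoch_seconds=3600.0, txid=4_600_000)
    rate = xid_burn_rate(first, second)
    print(f"burn rate: {rate:.1f} XIDs/s")
    print(f"50M XIDs last about {hours_to_consume(50_000_000, rate):.1f} hours")
```

This only estimates the rate. How to turn that number into a
vacuum_freeze_min_age setting is a policy decision the sketch does not
attempt to make.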

I think auto_explain could help you track down such rare incidents
if it could dump several statistics into the server log, for example
per-session lock waits and block read/write statistics.

Is there anything we should add to auto_explain?

Satoshi Nagayasu <snaga(at)uptime(dot)jp>
Uptime Technologies, LLC.
