Re: pg_stat_statements

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_stat_statements
Date: 2008-06-24 07:07:50
Message-ID: 1214291270.9468.809.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On Tue, 2008-06-24 at 12:14 +0900, ITAGAKI Takahiro wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> > I think you want to see the distribution of execution times for
> > particular queries without needing to log *every* execution, including
> > parameters. I think I understand now what you are asking for and why you
> > are asking for it.
>
> Yes. In many case, major influential queries don't depend on thier
> parameters. Also, averages of execution time are useful if you want to
> log indivisual queries that takes long time compared to the averages.
> For example, you can use configuration like:
> "If an execution time is 3 times longer than the average of the same plan,
> write the explain-tree into log".

Yes, thats good.

> That reminds me, it might have association with Auto-explain patch.
> Is it given up?
> http://archives.postgresql.org/pgsql-patches/2008-01/msg00123.php

Not by me. If the author doesn't finish it someone else will.

> > We could also have a function that causes each backend to dump the
> > current averages of all plans through to the stats collector, so you can
> > assemble a global view. But that should be on-demand, not a continuous
> > flow of stats, IMHO.
>
> Hmm, it's interesting idea. It doesn't require any additional shared
> memory and max length of stat message would be enough for typical
> queries (PGSTAT_MSG_PAYLOAD is 1000 bytes or less).

There is a related issue here. Sending the text of every query to the
stat collector is an overhead for the same reason sending the text to
the log is an overhead. The solution should be the same also: don't send
the text repeatedly.

The reason we don't increase PGSTAT_MSG_PAYLOAD is that we don't want to
experience more overhead. But if we work out a way to reduce the
frequency of messages, we might be able to increase PGSTAT_MSG_PAYLOAD,
so we see the whole text when we want it, but not send it when nobody is
looking. Now that's a gorgias solution.

For protocol 3 we should be sending an id, not the whole text each time
we execute. But that won't solve the problem for simple queries.

So pg_stat_get_activity() should poll backends for the SQL of the
currently executing query, not be continually sent.

> I'm not sure
> how to *notify* all backends to send plans to collector, but we could
> send plans periodically instead. Bloated pgstat.stat file might be
> another problem, and I have no idea about it.

Code is already there, and as of a few days ago, unused.
PMSIGNAL_WAKEN_CHILDREN

> I think the stats collector way is not more than the shared memory way in
> efficiency, but it is more flexible. I'll reconsider and compare them...

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2008-06-24 07:52:22 Re: Git Repository for WITH RECURSIVE and others
Previous Message Shane Ambler 2008-06-24 06:40:09 Re: Dept of ugly hacks: eliminating padding space in system indexes