Buffer usage in EXPLAIN and pg_stat_statements

From: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Buffer usage in EXPLAIN and pg_stat_statements
Date: 2009-08-17 07:45:03
Message-ID: 20090817155032.9A28.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here is a proposal to add buffer usage information to EXPLAIN and
contrib/pg_stat_statements. We can retrieve new values 'gets',
'reads' and 'temp':

- gets : total number of buffer pool access
- reads : total number of data file access
- temp : total number of temp file access (sort)

In EXPLAIN, we can use "EXPLAIN (ANALYZE, BUFFER) ..." syntax.
Each executor node shows buffer usage only in it; parent nodes
don't contain buffer usages in their sub nodes.

In pg_stat_statements, new 3 columns are added to the view.
We can determine queries that consume I/O bandwidth using
"SELECT * FROM pg_stat_statements ORDER BY reads DESC".
We will find out bad queries easily using those buffer and
disk access information in addition to duration statistics.

I implementd this feature using an instrumentation stack. A global
variable CurrentInstrument points top of the stack, and each
Instrumentation are linked with newly added 'prev' field.
The stack must be reset even on error because each innstrumentation
might have been deallocated already. I added codes to reset stack
in main loop of backend for the purpose.

TopInstrument is a special node that sums up all of the child nodes.
It tracks QueryDesc.totaltime and used in pg_stat_statements. There
might be another idea that walking around on planstate tree and
gathering all counters in the module, but very complex codes are
needed. I chose a simple way.

I'll write documentations if this design is accepted.
Comments welcome.

Output samples are below:

[EXPLAIN]
=# EXPLAIN (ANALYZE, BUFFER) SELECT * FROM pgbench_accounts ORDER BY bid;
QUERY PLAN
----------------------------------------------------------
Sort (cost=...) (actual ...) (gets=0 reads=0 temp=1309)
Sort Key: bid
Sort Method: external sort Disk: 10472kB
-> Seq Scan on pgbench_accounts (cost=...) (actual ...) (gets=1798 reads=1644 temp=0)
Total runtime: 75.867 ms

[contrib/pg_stat_statements]
=# SELECT query, gets, reads FROM pg_stat_statements ORDER BY gets DESC LIMIT 4;
query | gets | reads
----------------------------------------------------------------------+-------+-------
UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2; | 58628 | 1
UPDATE pgbench_accounts SET abalance = abalance + $1 WHERE aid = $2; | 26999 | 1929
UPDATE pgbench_tellers SET tbalance = tbalance + $1 WHERE tid = $2; | 25474 | 1
SELECT abalance FROM pgbench_accounts WHERE aid = $1; | 19950 | 0
(4 rows)

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment Content-Type Size
buffer_usage-20090817.patch application/octet-stream 18.8 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2009-08-17 08:19:01 Re: hot standby - merged up to CVS HEAD
Previous Message Michael Meskes 2009-08-17 07:40:05 Re: Split-up ECPG patches