Summary function for pg_buffercache

From: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Summary function for pg_buffercache
Date: 2022-08-18 13:57:16
Message-ID: CAGPVpCQAXYo54Q=8gqBsS=u0uk9qhnnq4+710BtUhUisX1XGEg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi hackers,

Added a pg_buffercache_summary() function to retrieve an aggregated summary
information with less cost.

It's often useful to know only how many buffers are used, how many of them
are dirty etc. for monitoring purposes.
This info can already be retrieved by pg_buffercache. The extension
currently creates a row with many details for each buffer, then summary
info can be aggregated from that returned table.
But it is quite expensive to run regularly for monitoring.

The attached patch adds a pg_buffercache_summary() function to get this
summary info faster.
New function only collects following info and returns them in a single row:
- used_buffers = number of buffers with a valid relfilenode (both dirty and
not)
- unused_buffers = number of buffers with invalid relfilenode
- dirty_buffers = number of dirty buffers.
- pinned_buffers = number of buffers that have at least one pinning backend
(i.e. refcount > 0)
- average usagecount of used buffers

One other difference between pg_buffercache_summary and
pg_buffercache_pages is that pg_buffercache_summary does not get locks on
buffer headers as opposed to pg_buffercache_pages.
Since the purpose of pg_buffercache_summary is just to give us an overall
idea about shared buffers and to be a cheaper function, locks are not
strictly needed.

To compare pg_buffercache_summary() and pg_buffercache_pages(), I used a
simple query to aggregate the summary information above by calling
pg_buffercache_pages().
Here is the result:

postgres=# show shared_buffers;
shared_buffers
----------------
16GB
(1 row)

Time: 0.756 ms
postgres=# SELECT relfilenode <> 0 AS is_valid, isdirty, count(*) FROM
pg_buffercache GROUP BY relfilenode <> 0, isdirty;
is_valid | isdirty | count
----------+---------+---------
t | f | 209
| | 2096904
t | t | 39
(3 rows)

Time: 1434.870 ms (00:01.435)
postgres=# select * from pg_buffercache_summary();
used_buffers | unused_buffers | dirty_buffers | pinned_buffers |
avg_usagecount
--------------+----------------+---------------+----------------+----------------
248 | 2096904 | 39 | 0 |
3.141129
(1 row)

Time: 9.712 ms

There is a significant difference between timings of those two functions,
even though they return similar results.

I would appreciate any feedback/comment on this change.

Thanks,
Melih

Attachment Content-Type Size
0001-Added-pg_buffercache_summary-function.patch application/octet-stream 7.0 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2022-08-18 13:57:55 Re: Data caching
Previous Message Tom Lane 2022-08-18 13:53:48 Re: Strip -mmacosx-version-min options from plperl build