Re: Summary function for pg_buffercache

From: Andres Freund <andres(at)anarazel(dot)de>
To: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
Cc: Zhang Mingli <zmlpostgres(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Summary function for pg_buffercache
Date: 2022-10-12 19:27:54
Message-ID: 20221012192754.25tuutl6sgeod2w3@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2022-09-28 18:19:57 +0300, Melih Mutlu wrote:
> diff --git a/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
> new file mode 100644
> index 0000000000..77e250b430
> --- /dev/null
> +++ b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
> @@ -0,0 +1,13 @@
> +/* contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql */
> +
> +-- complain if script is sourced in psql, rather than via ALTER EXTENSION
> +\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.4'" to load this file. \quit
> +
> +CREATE FUNCTION pg_buffercache_summary()
> +RETURNS TABLE (buffers_used int4, buffers_unused int4, buffers_dirty int4,
> + buffers_pinned int4, usagecount_avg real)
> +AS 'MODULE_PATHNAME', 'pg_buffercache_summary'
> +LANGUAGE C PARALLEL SAFE;

I think using RETURNS TABLE isn't quite right here, as it implies 'SETOF'. But
the function doesn't return a set of rows. I changed this to use OUT
parameters.

> +-- Don't want these to be available to public.
> +REVOKE ALL ON FUNCTION pg_buffercache_summary() FROM PUBLIC;

I think this needs to grant to pg_monitor too. See
pg_buffercache--1.2--1.3.sql

I added a test verifying the permissions are right, with the hope that it'll
make future contributors try to add a parallel test and notice the permissions
aren't right.

> + /* Construct a tuple descriptor for the result rows. */
> + tupledesc = CreateTemplateTupleDesc(NUM_BUFFERCACHE_SUMMARY_ELEM);

Given that we define the return type on the SQL level, it imo is nicer to use
get_call_result_type() here.

> + TupleDescInitEntry(tupledesc, (AttrNumber) 5, "usagecount_avg",
> + FLOAT4OID, -1, 0);

I changed this to FLOAT8. Not that the precision will commonly be useful, but
it doesn't seem worth having to even think about whether there are cases where
it'd matter.

I also changed it so that the accumulation happens in an int64 variable named
usagecount_total, which gets converted to a double only when actually
computing the result.

> <para>
> The <filename>pg_buffercache</filename> module provides a means for
> - examining what's happening in the shared buffer cache in real time.
> + examining what's happening in the shared buffer in real time.
> </para>

This seems to be an unnecessary / unrelated change. I suspect you made it in
response to
https://postgr.es/m/20220922161014.copbzwdl3ja4nt6z%40awork3.anarazel.de
but that was about a different sentence, where you said 'shared buffer caches'
(even though there is only a single shared buffer cache).

> <indexterm>
> @@ -17,10 +17,19 @@
> </indexterm>
>
> <para>
> - The module provides a C function <function>pg_buffercache_pages</function>
> - that returns a set of records, plus a view
> - <structname>pg_buffercache</structname> that wraps the function for
> - convenient use.
> + The module provides C functions <function>pg_buffercache_pages</function>
> + and <function>pg_buffercache_summary</function>.
> + </para>
> +
> + <para>
> + The <function>pg_buffercache_pages</function> function
> + returns a set of records, plus a view <structname>pg_buffercache</structname> that wraps the function for
> + convenient use is provided.
> + </para>

I rephrased this, because it sounds like the function returns a set of records
and a view.

> + <para>
> + The <function>pg_buffercache_summary</function> function returns a table with a single row
> + that contains summarized and aggregated information about shared buffer.
> </para>

"summarized and aggregated" is quite redundant.

> + <table id="pgbuffercachesummary-columns">
> + <title><structname>pg_buffercachesummary</structname> Columns</title>

Missing underscore.

> + <row>
> + <entry role="catalog_table_entry"><para role="column_definition">
> + <structfield>buffers_unused</structfield> <type>int4</type>
> + </para>
> + <para>
> + Number of shared buffers that not currently being used
> + </para></entry>
> + </row>

There's a missing 'are' in here, I think. I rephrased all of these to
"Number of (used|unused|dirty|pinned) shared buffers"

> + <row>
> + <entry role="catalog_table_entry"><para role="column_definition">
> + <structfield>buffers_dirty</structfield> <type>int4</type>
> + </para>
> + <para>
> + Number of dirty shared buffers
> + </para></entry>
> + </row>
> +
> + <row>
> + <entry role="catalog_table_entry"><para role="column_definition">
> + <structfield>buffers_pinned</structfield> <type>int4</type>
> + </para>
> + <para>
> + Number of shared buffers that has a pinned backend
> + </para></entry>
> + </row>

Backends pin buffers, not the other way round...

> + <para>
> + There is a single row to show summarized information of all shared buffers.
> + <function>pg_buffercache_summary</function> is not interested
> + in the state of each shared buffer, only shows aggregated information.
> + </para>
> +
> + <para>
> + The <function>pg_buffercache_summary</function> doesn't provide a result
> + that is consistent across all buffers. This is intentional. The purpose
> + of this function is to provide a general idea about the state of shared
> + buffers as fast as possible. Additionally, <function>pg_buffercache_summary</function>
> + allocates much less memory.
> + </para>

I still didn't like this comment. Please see the attached.

I intentionally put my changes into a fixup commit, in case you want to look
at the differences.

Greetings,

Andres Freund

Attachment Content-Type Size
v15-0001-pg_buffercache-Add-pg_buffercache_summary.patch text/x-diff 11.9 KB
v15-0002-fixup-pg_buffercache-Add-pg_buffercache_summary.patch text/x-diff 10.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2022-10-12 19:35:31 how to correctly react on exception in pfree function?
Previous Message Peter Eisentraut 2022-10-12 19:19:17 Re: Simplifying our Trap/Assert infrastructure