Re: Summary function for pg_buffercache

From: Andres Freund <andres(at)anarazel(dot)de>
To: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
Cc: Aleksander Alekseev <aleksander(at)timescale(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Summary function for pg_buffercache
Date: 2022-09-22 16:10:14
Message-ID: 20220922161014.copbzwdl3ja4nt6z@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2022-09-22 18:22:44 +0300, Melih Mutlu wrote:
> Since header locks are removed again, I put some doc changes and comments
> back.

Due to the merge of the meson build system, this needs to adjust meson.build
as well.

> --- a/contrib/pg_buffercache/expected/pg_buffercache.out
> +++ b/contrib/pg_buffercache/expected/pg_buffercache.out
> @@ -8,3 +8,12 @@ from pg_buffercache;
> t
> (1 row)
>
> +select buffers_used + buffers_unused > 0,
> + buffers_dirty < buffers_used,
> + buffers_pinned < buffers_used

Doesn't these have to be "<=" instead of "<"?

> + for (int i = 0; i < NBuffers; i++)
> + {
> + BufferDesc *bufHdr;
> + uint32 buf_state;
> +
> + /*
> + * No need to get locks on buffer headers as we don't rely on the
> + * results in detail. Therefore, we don't get a consistent snapshot
> + * across all buffers and it is not guaranteed that the information of
> + * each buffer is self-consistent as opposed to pg_buffercache_pages.
> + */

I think the "consistent snapshot" bit is misleading - even taking buffer
header locks wouldn't give you that.

> + if (buffers_used != 0)
> + usagecount_avg = usagecount_avg / buffers_used;

Perhaps the average should be NULL in the buffers_used == 0 case?

> + <para>
> + <function>pg_buffercache_pages</function> function
> + returns a set of records, plus a view <structname>pg_buffercache</structname> that wraps the function for
> + convenient use is provided.
> + </para>
> +
> + <para>
> + <function>pg_buffercache_summary</function> function returns a table with a single row
> + that contains summarized and aggregated information about shared buffer caches.
> </para>

I think these sentences are missing a "The " at the start?

"shared buffer caches" isn't right - I think I'd just drop the "caches".

> + <para>
> + There is a single row to show summarized information of all shared buffers.
> + <function>pg_buffercache_summary</function> is not interested
> + in the state of each shared buffer, only shows aggregated information.
> + </para>
> +
> + <para>
> + <function>pg_buffercache_summary</function> doesn't take buffer manager
> + locks. Unlike <function>pg_buffercache_pages</function> function
> + <function>pg_buffercache_summary</function> doesn't take buffer headers locks
> + either, thus the result is not consistent. This is intentional. The purpose
> + of this function is to provide a general idea about the state of shared
> + buffers as fast as possible. Additionally, <function>pg_buffercache_summary</function>
> + allocates much less memory.
> + </para>
> + </sect2>

I don't think this mentioning of buffer header locks is useful for users - nor
is it I think correct. Acquiring the buffer header locks wouldn't add *any*
additional consistency.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Önder Kalacı 2022-09-22 16:13:50 Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Previous Message Andres Freund 2022-09-22 15:42:25 Re: libpq error message refactoring