Re: Summary function for pg_buffercache

From: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
To: Aleksander Alekseev <aleksander(at)timescale(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>
Subject: Re: Summary function for pg_buffercache
Date: 2022-09-10 12:55:30
Message-ID: CAGPVpCSh6FCO6=gKtzQQx+fw=ayB0CNk7Et5vbt3yXs5hZ_fUA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Aleksander,

> I'm not sure about what undefined behaviour could harm this badly.
>
> You are right that in practice nothing wrong will (probably) happen on
> x86/x64 architecture with (most?) modern C compilers. This is not true in
> the general case though. It's up to the compiler to decide how reading the
> bufHdr->tag is going to be actually implemented. This can be one assembly
> instruction or several instructions. This reading can be optimized-out if
> the compiler believes the required value is already in the register, etc.
> Since the result will be different depending on the assembly code used this
> is an undefined behaviour and we can't use code like this.
>

Got it. Thanks for explaining.

> However I have somewhat mixed feelings about avg_usagecount. Generally
> AVG() is a relatively useless methric for monitoring. What if the user
> wants MIN(), MAX() or let's say a 99th percentile? I suggest splitting it
> into usagecount_min, usagecount_max and usagecount_sum. AVG() can be
> derived as usercount_sum / used_buffers.
>

Won't be usagecount_max almost always 5 as "BM_MAX_USAGE_COUNT" set to 5 in
buf_internals.h? I'm not sure about how much usagecount_min would add
either.
A usagecount is always an integer between 0 and 5, it's not
something unbounded. I think the 99th percentile would be much better than
average if strong outlier values could occur. But in this case, I feel like
an average value would be sufficiently useful as well.
usagecount_sum would actually be useful since average can be derived from
it. If you think that the sum of usagecounts has a meaning just by itself,
it makes sense to include it. Otherwise, wouldn't showing directly averaged
value be more useful?

> Also I suggest changing the names of the columns in order to make them
> consistent with the rest of the system. If you consider pg_stat_activity
> and family [1] you will notice that the columns are named
> (entity)_(property), e.g. backend_xid, backend_type, client_addr, etc. So
> instead of used_buffers and unused_buffers the naming should be
> buffers_used and buffers_unused.
>
> [1]: https://www.postgresql.org/docs/current/monitoring-stats.html
>

You're right. I will change the names accordingly. Thanks.

Regards,
Melih

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Treat 2022-09-10 13:07:29 Re: small windows psqlrc re-wording
Previous Message Ranier Vilela 2022-09-10 10:58:23 Re: Fix typo function circle_same (src/backend/utils/adt/geo_ops.c)