Re: Summary function for pg_buffercache

From: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
To: Aleksander Alekseev <aleksander(at)timescale(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Summary function for pg_buffercache
Date: 2022-09-20 08:47:40
Message-ID: CAGPVpCRtjkm9jVAq6ND3NPTBAvs6mLYUu9fTm6ZgNh0FXNBU=g@mail.gmail.com
Lists: pgsql-hackers

Hi,

> Also I suggest changing the names of the columns in order to make them
> consistent with the rest of the system. If you consider pg_stat_activity
> and family [1] you will notice that the columns are named
> (entity)_(property), e.g. backend_xid, backend_type, client_addr, etc. So
> instead of used_buffers and unused_buffers the naming should be
> buffers_used and buffers_unused.
>
> [1]: https://www.postgresql.org/docs/current/monitoring-stats.html

I changed these names and updated the patch.

>> However I have somewhat mixed feelings about avg_usagecount. Generally
>> AVG() is a relatively useless metric for monitoring. What if the user
>> wants MIN(), MAX() or let's say a 99th percentile? I suggest splitting it
>> into usagecount_min, usagecount_max and usagecount_sum. AVG() can be
>> derived as usagecount_sum / used_buffers.
>>
>
> Won't usagecount_max almost always be 5, as "BM_MAX_USAGE_COUNT" is set to 5
> in buf_internals.h? I'm not sure about how much usagecount_min would add
> either.
> A usagecount is always an integer between 0 and 5, it's not
> something unbounded. I think the 99th percentile would be much better than
> average if strong outlier values could occur. But in this case, I feel like
> an average value would be sufficiently useful as well.
> usagecount_sum would actually be useful since average can be derived from
> it. If you think that the sum of usagecounts has a meaning just by itself,
> it makes sense to include it. Otherwise, wouldn't showing the averaged value
> directly be more useful?
>

Aleksander, do you still think the average usagecount is a bit useless? Or
does it make sense to you to keep it like this?
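
For illustration, here is a minimal sketch of the relationship being discussed:
the average is just the sum of usagecounts divided by the number of used
buffers, and every usagecount is bounded by the maximum (5, per the
BM_MAX_USAGE_COUNT remark above). This is not the code from the patch. The
SketchBuffer and SketchSummary types and both functions are hypothetical
stand-ins, and no locking is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror BM_MAX_USAGE_COUNT (5, as stated in the thread). */
#define SKETCH_MAX_USAGE_COUNT 5

/* Hypothetical stand-in for a buffer header; not the real BufferDesc. */
typedef struct SketchBuffer
{
    int in_use;      /* nonzero if the buffer currently holds a page */
    int usagecount;  /* 0 .. SKETCH_MAX_USAGE_COUNT */
} SketchBuffer;

/* Hypothetical summary row using the renamed columns. */
typedef struct SketchSummary
{
    int32_t buffers_used;
    int32_t buffers_unused;
    int64_t usagecount_sum;
} SketchSummary;

/* Single pass over the buffers, keeping only three counters. */
static SketchSummary
sketch_summarize(const SketchBuffer *bufs, size_t nbufs)
{
    SketchSummary s = {0, 0, 0};

    for (size_t i = 0; i < nbufs; i++)
    {
        if (!bufs[i].in_use)
        {
            s.buffers_unused++;
            continue;
        }

        /* usagecount is bounded, so strong outliers cannot occur. */
        assert(bufs[i].usagecount >= 0 &&
               bufs[i].usagecount <= SKETCH_MAX_USAGE_COUNT);

        s.buffers_used++;
        s.usagecount_sum += bufs[i].usagecount;
    }
    return s;
}

/* The average is derivable from the sum; returns 0 if nothing is in use. */
static double
sketch_avg_usagecount(const SketchSummary *s)
{
    if (s->buffers_used == 0)
        return 0.0;
    return (double) s->usagecount_sum / s->buffers_used;
}
```

Exposing only the sum keeps the summary exact and lets the caller derive the
average, while exposing only the average saves the caller one division.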

> > I suggest we focus on saving the memory first and then think about the
> > performance, if necessary.
>
> Personally I think the locks part is at least as important - it's what makes
> the production impact higher.
>

I agree that it's important due to its high impact. I'm not sure how to
avoid any undefined behaviour without locks though.
Even with locks, performance is much better. But is it good enough for
production?

Thanks,
Melih

Attachment: v6-0001-Added-pg_buffercache_summary-function.patch (application/octet-stream, 11.1 KB)
