Re: [PATCH] Tracking statements entry timestamp in pg_stat_statements

From: "Anton A(dot) Melnikov" <aamelnikov(at)inbox(dot)ru>
To: Andrei Zubkov <zubkov(at)moonset(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PATCH] Tracking statements entry timestamp in pg_stat_statements
Date: 2021-10-18 19:11:12
Message-ID: 62d16845-e74e-a6f9-9661-022e44f48922@inbox.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 07.10.2021 15:31, Andrei Zubkov wrote:
> There is an issue with this patch. It's main purpose is the ability to
> calculate values of pg_stat_statements view
>  [...]
> Does addition of resettable min/max metrics to the
> pg_stat_statemets view seems reasonable here?

Hello, Andrey!

I think it depends on what the slow top level sampler wants.
Let define the current values in pg_stat_statements for some query as gmin and gmax.
It seems to be a two main variants:
1) If top level sampler wants to know for some query the min and max values for
the entire watch time, then existing gmin and gmax in pg_stat_statements are sufficient.
The top level sampler can clarify top_min and top_max at every slow sample as follows:
top_max = gmax > top_max ? gmax : top_max;
top_min = gmin < top_min ? gmin : top_min;
This should work regardless of whether there was a reset between samples or not.
2) The second case happens when the top level sampler wants to know the min and max
values for sampling period.
In that case i think one shouldn't not use gmin and gmax and especially reset
them asynchronously from outside because its lifetime and sampling period are
independent values and moreover someone else might need gmin and gmax as is.
So i suppose that additional vars loc_min and loc_max is a right way to do it.
If that, at every top sample one need to replace not clarify
the new top values as follows:
top_max = loc_max; loc_max = 0;
top_min = loc_min; loc_min = INT_MAX;
And one more thing, if there was a reset of stats between two samples,
then i think it is the best to ignore the new values,
since they were obtained for an incomplete period.
This patch, thanks to the saved time stamp, makes possible
to determine the presence of reset between samples and
there should not be a problem to realize such algorithm.

The patch is now applied normally, all my tests were successful.
The only thing I could suggest to your notice
is a small cosmetic edit to replace
the numeric value in #define on line 1429 of pg_stat_statements.c
by one of the constants defined above.

Best regards,
Anton Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Gilles Darold 2021-10-18 20:15:23 Re: [PATCH] Proposal for HIDDEN/INVISIBLE column
Previous Message Tomas Vondra 2021-10-18 19:02:56 Re: XTS cipher mode for cluster file encryption