Re: [PATCH] Tracking statements entry timestamp in pg_stat_statements

From: Andrei Zubkov <zubkov(at)moonset(dot)ru>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Greg Stark <stark(at)mit(dot)edu>, Andres Freund <andres(at)anarazel(dot)de>, Julien Rouhaud <rjuju123(at)gmail(dot)com>
Subject: Re: [PATCH] Tracking statements entry timestamp in pg_stat_statements
Date: 2023-01-18 19:04:56
Message-ID: d80d62f28597fff78321350b1d9f38b2ed54c52b.camel@moonset.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Tomas,

On Wed, 2023-01-18 at 17:29 +0100, Tomas Vondra wrote:
> I took a quick look at this patch, to see if there's something we
> want/can get into v16. The last version was submitted about 9 months
> ago, and it doesn't apply cleanly anymore, but the bitrot is fairly
> minor. Not sure there's still interest, though.

Thank you for your attention to this patch!

I'm still very interest in this patch. And I think I'll try to rebase
this patch during a week or two if it seems possible to get it in 16..
>
> I'd probably call it "created_at" or something like
> that, but that's a minor detail. Or maybe stats_reset, which is what
> we
> use in pgstat?

Yes there is some naming issue. My thought was the following:
- "stats_reset" is not quite correct here, because the statement entry
moment if definitely not a reset. The field named just as it means -
this is time of the moment from which statistics is collected for this
particular entry.
- "created_at" perfectly matches the purpose of the field, but seems
not such self-explaining to me.

>
> But is the second timestamp for the min/max fields really useful?
> AFAIK
> to perform analysis, people take regular pg_stat_statements
> snapshots,
> which works fine for counters (calculating deltas) but not for gauges
> (which need a reset, to track fresh values). But people analyzing
> this
> are already resetting the whole entry, and so the snapshots already
> are
> tracking deltas.
>
> So I'm not convinced actually need the second timestamp.

The main purpose of the patch is to provide means to collecting
solutions to avoid the reset of pgss at all. Just like it happens for
the pg_stat_ views. The only really need of reset is that we can't be
sure that observing statement was not evicted and come back since last
sample. Right now we only can do a whole reset on each sample and see
how many entries will be in pgss hashtable on the next sample - how
close this value to the max. If there is a plenty space in hashtable we
can hope that there was not evictions since last sample. However there
could be reset performed by someone else and we are know nothing about
this.
Having a timestamp in stats_since field we are sure about how long this
statement statistics is tracked. That said sampling solution can
totally avoid pgss resets. Avoiding such resets means avoiding
interference between monitoring solutions.
But if no more resets is done we can't track min/max values, because
they still needs a reset and we can do nothing with such resets - they
are necessary. However I still want to know when min/max reset was
performed. This will help to detect possible interference on such
resets.
>
>
> A couple more comments:
>
> 1) I'm not sure why the patch is adding tests of permissions on the
> pg_stat_statements_reset function?
>
> 2) If we want the second timestamp, shouldn't it also cover resets of
> the mean values, not just min/max?

I think that mean values shouldn't be target for a partial reset
because the value for mean values can be easily reconstructed by the
sampling solution without a reset.

>
> 3) I don't understand why the patch is adding "IS NOT NULL AS t" to
> various places in the regression tests.

The most of tests was copied from the previous version. I'll recheck
them.

>
> 4) I rather dislike the "minmax" naming, because that's often used in
> other contexts (for BRIN indexes), and as I mentioned maybe it should
> also cover the "mean" fields.

Agreed, but I couldn't make it better. Other versions seemed worse to
me...
>
>
Regards, Andrei Zubkov

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message gkokolatos 2023-01-18 19:05:43 Re: Add LZ4 compression in pg_dump
Previous Message Robert Haas 2023-01-18 19:02:05 Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation