Re: pg_stat_statements oddity with track = all

From: Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Sergei Kornilov <sk(at)zsrv(dot)org>, legrand legrand <legrand_legrand(at)hotmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_stat_statements oddity with track = all
Date: 2021-01-19 08:55:18
Message-ID: 4dce029c72fc5eb084c5353f31daa13f@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Thanks for making the patch to add "toplevel" column in
pg_stat_statements.
This is a review comment.

There hasn't been any discussion, at least that I've been able to find.
So, +1 to change the status to "Ready for Committer".

1. submission/feature review
=============================

This patch can be applied cleanly to the current master branch(ed4367).
I tested with `make check-world` and I checked there is no fail.

This patch has reasonable documents and tests.
A "toplevel" column of pg_stat_statements view is documented and
following tests are added.
- pg_stat_statements.track option : 'top' and 'all'
- query type: normal query and nested query(pl/pgsql)

I tested the "update" command can work.
postgres=# ALTER EXTENSION pg_stat_statements UPDATE TO '1.10';

Although the "toplevel" column of all queries which already stored is
'false',
we have to decide the default. I think 'false' is ok.

2. usability review
====================

This patch solves the problem we can't get to know
which query is top-level if pg_stat_statements.track = 'all'.

This leads that we can analyze with aggregated queries.

There is some use-case.
For example, we can know the sum of total_exec_time of queries
even if nested queries are executed.

We can know how efficiently a database can use CPU resource for queries
using the sum of the total_exec_time, and the exec_user_time and
exec_system_time in pg_stat_kcache which is the extension gathering
os resources.

Although one concern is whether only top-level is enough or not,
I couldn't come up with any use-case to use nested level, so I think
it's ok.

3. coding review
=================

Although I had two concerns, I think they are no problem.
So, this patch looks good to me.

Following were my concerns.

A. the risk of too many same queries is duplicate.

Since this patch adds a "top" member in the hash key,
it leads to store duplicated same query which "top" is false and true.

This concern is already discussed and I agreed to the consensus
it seems unlikely to have the same queries being executed both
at the top level and as nested statements.

B. add a argument of the pg_stat_statements_reset().

Now, pg_stat_statements supports a flexible reset feature.
> pg_stat_statements_reset(userid Oid, dbid Oid, queryid bigint)

Although I wondered whether we need to add a top-level flag to the
arguments,
I couldn't come up with any use-case to reset only top-level queries or
not top-level queries.

4. others
==========

These comments are not related to this patch.

A. about the topic of commitfests

Since I think this feature is for monitoring,
it's better to change the topic from "System Administration"
to "Monitoring & Control".

B. check logic whether a query is top-level or not in pg_stat_kcache

This patch uses the only exec_nested_level to check whether a query is
top-level or not.
The reason is nested_level is less useful and I agree.

But, pg_stat_kcache uses plan_nested_level too.
Although the check result is the same, it's better to change it
corresponding to this patch after it's committed.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2021-01-19 09:01:48 Re: Single transaction in the tablesync worker?
Previous Message Kyotaro Horiguchi 2021-01-19 08:32:00 Re: Is it worth accepting multiple CRLs?