{PROPOSAL] add session information column to pg_stat_statements

From: Sergei Agalakov <sergei(dot)agalakov(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: {PROPOSAL] add session information column to pg_stat_statements
Date: 2018-12-01 22:13:01
Message-ID: 3aa097d7-7c47-187b-5913-db8366cd4491@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I have renamed the thread [PROPOSAL] extend the object names to the
qualified names in pg_stat_statements
started on
https://www.postgresql.org/message-id/9baf5c06-d6ab-c688-010c-843348e3d98c%40gmail.com
and ended on
https://www.postgresql.org/message-id/c93bb5ce-22bd-eb6b-a057-d0666585258f%40gmail.com

Currently pg_stat_statements doesn't have enough information to distinct
queries executed in the different environment;
for example the queries with the same text from pg_stat_statements.query
column can be the different queries if they were executed with
the different search path parameter.
The initial proposed solution was to extend names of the objects in the
query to the qualified names, so for the text
"select * from t1"
from the pg_stat_statements.query column the new proposed column
pg_stat_statements.query_qn would have the text
"select * from s1.t1"
Based on the discussion this solution has proved to be
a) relatively difficult to implement and slow to execute
b) resolves only the missed schema name problem but other differences in
the execution environment would require some new extra columns

So the new proposed change addresses these concerns by been
a) faster (probably)
b) extensible
and, of course, it is backward compatible with the existing
pg_stat_statements view.

We can add a column pg_stat_statements.session_info jsonb.
Its content can be defined by the new configuration parameter
pg_stat_statements.session_info ('current_schemas, current_user,
session_user') // a subset of the data from the system information functions
or in the initial implementation it can be hardcoded to include at least
current_schemas and current user.

and it will have data like
{
     "current_schemas" : ["pg_catalog", "s1", "s2", "public"],
     "current_user" : "user1",
     "session_user" : "user1"
}

It will allow the DBA/developer to understand and reproduce a
performance issue, and will allow the deeper level of granularity for
the reporting tools.

I don't know how difficult will be to implement something like that.

Thank you,

Sergei

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2018-12-01 22:25:08 on or true
Previous Message Andres Freund 2018-12-01 21:20:56 Re: Bug fix for glibc broke freebsd build in REL_11_STABLE