Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

From: Julien Rouhaud <julien(dot)rouhaud(at)free(dot)fr>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Evgeny Efimkin <efimkin(at)yandex-team(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?
Date: 2019-09-11 16:30:22
Message-ID: CAOBaU_aQKkB3HehOp820ssJXBQbGiAQjxvB_8ynNG2g4MbUUjA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for looking at it!

On Wed, Sep 11, 2019 at 6:45 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> An invalid query ID is assumed to be 0 in the patch, per the way it is
> defined in pg_stat_statements. However this also maps with the case
> where we have a utility statement.

Oh indeed. Which means that if a utility statements later calls
parse_analyze or friends, this patch would report an unexpected
queryid. That's at least possible for something like

COPY (SELECT * FROM tbl) TO ...

The thing is that pg_stat_statements assigns a 0 queryid in the
post_parse_analyze_hook to recognize utility statements and avoid
tracking instrumentation twice in case of utility statements, and then
compute a queryid base on a hash of the query text. Maybe we could
instead fully reserve queryid "2" for utility statements (so forcing
queryid "1" for standard queries if jumbling returns 0 *or* 2 instead
of only 0), and use "2" as the identifier for utility statement
instead of "0"?

> + /*
> + * We only report the top-level query identifiers. The stored queryid is
> + * reset when a backend call pgstat_report_activity(STATE_RUNNING), or with
> + * an explicit call to this function. If the saved query identifier is not
> + * zero it means that it's not a top-level command, so ignore the one
> + * provided unless it's an explicit call to reset the identifier.
> + */
> + if (queryId != 0 && beentry->st_queryid != 0)
> + return;
> Hmm. I am wondering if we shouldn't have an API dedicated to the
> reset of the query ID. That logic looks rather brittle..

How about adding a "bool force" parameter to allow resetting the queryid to 0?

> Wouldn't it be better (and more consistent) to update the query ID in
> parse_analyze_varparams() and parse_analyze() as well after going
> through the post_parse_analyze hook instead of pg_analyze_and_rewrite?

I thought about it without knowing what would be best. I'll change to
report the queryid right after calling post_parse_analyze_hook then.

> + /*
> + * If a new query is started, we reset the query identifier as it'll only
> + * be known after parse analysis, to avoid reporting last query's
> + * identifier.
> + */
> + if (state == STATE_RUNNING)
> + beentry->st_queryid = 0
> I don't quite get why you don't reset the counter in other cases as
> well. If the backend entry is idle in transaction or in an idle
> state, it seems to me that we should not report the query ID of the
> last query run in the transaction. And that would make the reset in
> exec_simple_query() unnecessary, no?

I'm reproducing the same behavior as for the query text, ie. showing
the information about the last executed query text if state is idle:

+ <entry><structfield>queryid</structfield></entry>
+ <entry><type>bigint</type></entry>
+ <entry>Identifier of this backend's most recent query. If
+ <structfield>state</structfield> is <literal>active</literal> this field
+ shows the identifier of the currently executing query. In all other
+ states, it shows the identifier of last query that was executed.

I think that showing the last executed query's queryid is as useful as
the query text. Also, while avoiding a reset in exec_simple_query()
it'd be required to do such reset in case of error during query
execution, so that wouldn't make things quite simpler..

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera from 2ndQuadrant 2019-09-11 17:44:23 Re: propagating replica identity to partitions
Previous Message Michael Meskes 2019-09-11 16:04:03 Re: A problem presentaion about ECPG, DECLARE STATEMENT