Re: refreshing query id for pg_stat_statements based on comment in sql

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Zhihong Yu <zyu(at)yugabyte(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: refreshing query id for pg_stat_statements based on comment in sql
Date: 2022-03-08 01:42:33
Message-ID: 20220308014233.lqxevlklbzdhyx2r@jrouhaud
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Mon, Mar 07, 2022 at 09:42:26AM -0800, Zhihong Yu wrote:
> Hi,
> Currently the query id for pg_stat_statements gets calculated based on the
> parse nodes specifics.
> This means that the user cannot add a comment to a SQL query to test
> something. (though some other RDBMS allows this practice).
>
> Consider this use case: for query q, admin looks at stats and performs some
> optimization (without changing the query). Admin adds / modifies the
> comment for q - now the query becomes q'. If query id doesn't change, there
> still would be one row in pg_stat_statements which makes it difficult to
> gauge the effectiveness of the tuning.
>
> I want to get opinion from the community whether adding / changing comment
> in SQL query should result in new query id for pg_stat_statements.

Are you talking about optimizer hint with something like pg_hint_plan, or just
random comment like "/* we now added index blabla */ SELECT ..."?

If the former, then such an extension can already provide its own queryid
generator which can chose to ignore part or all of the comments or not.

If the latter, then it seems shortsighted to me. At the very least not all
application can be modified to have a specific comment attached to a query.

Also, if you want check how a query if performing after doing some
modifications, you should start with some EXPLAIN ANALYZE first (or even a
simple EXPLAIN if you want to validate some new index using hypothetical
indexes). If this is some more general change (e.g. shared_buffers,
work_mem...) then the whole system is going to perform differently, and you
certainly won't add a new comment to every single query executed.

So again it seems to me that doing pg_stat_statement snapshots and comparing
the diff between each to see how the whole workload, or specific queries, is
behaving is still the best answer here.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-03-08 02:37:49 Re: Time to drop plpython2?
Previous Message Michael Paquier 2022-03-08 01:28:46 Re: pg_tablespace_location() failure with allow_in_place_tablespaces