Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, Atsushi Torikoshi <atorik(at)gmail(dot)com>, Tatsuro Yamada <tatsuro(dot)yamada(dot)tf(at)nttcom(dot)co(dot)jp>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Evgeny Efimkin <efimkin(at)yandex-team(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?
Date: 2020-10-14 12:25:00
Message-ID: CAOBaU_Yf7t9QVLK4+mf1Jk0RD+8CMKAOeRaBBcigfjSxwfcptQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 14, 2020 at 5:43 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
>
> On Tue, Oct 13, 2020 at 4:53 AM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> >
> > On Mon, Oct 12, 2020 at 04:07:30PM -0400, Tom Lane wrote:
> > > Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > > > On Mon, Oct 12, 2020 at 02:26:15PM -0400, Tom Lane wrote:
> > > >> Yeah, I agree --- a version number is the wrong way to think about this.
> > >
> > > > The version number was to invalidate _all_ query hashes if the
> > > > algorithm is slightly modified, rather than invalidating just some of
> > > > them, which could lead to confusion.
> > >
> > > Color me skeptical as to the use-case for that. From users' standpoints,
> > > the hash is mainly going to change when we change the set of parse node
> > > fields that get hashed. Which is going to happen at every major release
> > > and no (or at least epsilon) minor releases. So I do not see a point in
> > > tracking an algorithm version number as such. Seems like make-work.
> >
> > OK, I came up with the hash idea only to address one of your concerns
> > about mismatched hashes for algorithm improvements/changes. Seems we
> > might as well just document that cross-version hashes are different.
>
> Ok, so I tried to implement what seems to be the consensus. First
> attached patch moves the current pgss queryid computation in core,
> with a new compute_queryid GUC (on/off). One thing I don't really
> like about this patch is that the JumbleState that pgss needs in order
> to normalize the query string (the constants location and such) has to
> be done by the core while computing the queryid and provided to pgss
> in post_parse_analyse hook. That isn't ideal as it looks very
> specific to pgss needs. On the other hand it means that you can now
> use pgss with custom queryid heuristics by disabling compute_queryid
> and having your module doing only that in post_parse_analyse_hook.
> You'll however need to be careful to configure
> shared_preload_libraries such that your custom module's
> post_parse_analyse_hook is called first, so pgss' one can be called
> with the needed JumbleState. Note that if no JumbleState is provided
> pgss will store non normalized queries, but will otherwise behave as
> intended.
>
> The 2nd patch is the rebased original queryid exposure patch. No big
> changes, except that it now handles utility statements queryid
> generated during post_parse_analysis, same as regular queries. This
> should simplify the work needed for custom queryid third party
> modules.
>
> The 3rd patch changes explain (verbose) to display the queryid if one
> has been generated, whether by core or a third-party module. For
> instance:
>
> rjuju=# set compute_queryid = on;
> SET
> rjuju=# explain (verbose) select relname from pg_class;
> QUERY PLAN
> -----------------------------------------------------------------------
> Seq Scan on pg_catalog.pg_class (cost=0.00..16.90 rows=390 width=64)
> Output: relname
> Query Identifier: -5494854185674379299
> (3 rows)

There was a possibly uninitialized var issue in the previous patches
(thanks cfbot), v13 fixes that.

Attachment Content-Type Size
v13-0002-Expose-queryid-in-pg_stat_activity-and-log_line_.patch text/x-patch 34.6 KB
v13-0001-Move-pg_stat_statements-query-jumbling-to-core.patch text/x-patch 57.4 KB
v13-0003-Expose-query-identifier-in-verbose-explain.patch text/x-patch 3.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ajin Cherian 2020-10-14 12:44:05 Re: [HACKERS] logical decoding of two-phase transactions
Previous Message Andy Fan 2020-10-14 12:02:29 RelationGetNumberOfBlocks is called every time of heap_rescan.