Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, Atsushi Torikoshi <atorik(at)gmail(dot)com>, Tatsuro Yamada <tatsuro(dot)yamada(dot)tf(at)nttcom(dot)co(dot)jp>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Evgeny Efimkin <efimkin(at)yandex-team(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?
Date: 2020-10-18 08:12:48
Message-ID: CAOBaU_bcsqYWzyABkh=i8X96DBychnVxQBvFo9wow-afMuECCA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Oct 18, 2020 at 12:20 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> > Wait ... what? I've been thinking that this GUC is just to enable or
> > disable the computation of query ID, not to change the algorithm to do
> > so. Do we really need to allow different algorithms in different
> > sessions?
>
> We established that some time ago, no?

I thought we established the need for allowing different algorithms,
but I assumed globally not per session. Anyway, allowing to enable or
disable compute_queryid per session would technically allow that,
assuming that you have another module loaded that computes a queryid
only if no-one was already computed. In that case pg_stat_statements
works as you would expect, you will get a new entry, with a duplicated
query text.

With a bit more thinking, there's at least one use case where it's
interesting to disable pg_stat_statements: queries using temporary
tables. In that case you're guaranteed to generate an infinity of
different queryid. That doesn't really help since you're not
aggregating anything anymore, and it also makes pg_stat_statements
virtually unusable as once you have a workload that needs frequent
eviction, the overhead is so bad that you basically have to disable
pg_stat_statements. We could alternatively add a GUC to disable
queryid computation when one of the tables is a temporary table, but
that's yet one among many considerations that are probably best
answered with a custom implementation.

I'm also attaching an updated patch with some attempt to improve the
documentation. I mention that in-core algorithm may not suits
everyone's needs, but we don't actually document what heuristics are.
Should we give more details on them and what are the most direct
consequences?

Attachment Content-Type Size
v14-0002-Expose-queryid-in-pg_stat_activity-and-log_line_.patch text/x-patch 36.2 KB
v14-0003-Expose-query-identifier-in-verbose-explain.patch text/x-patch 5.6 KB
v14-0001-Move-pg_stat_statements-query-jumbling-to-core.patch text/x-patch 58.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2020-10-18 11:37:44 Re: [PATCH] Add extra statistics to explain for Nested Loop
Previous Message Tom Lane 2020-10-18 04:20:37 Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?