Re: compute_query_id and pg_stat_statements

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Christoph Berg <myon(at)debian(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: compute_query_id and pg_stat_statements
Date: 2021-05-13 04:03:42
Message-ID: 20210513040342.fouh7ryglrfgxglu@nol
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, May 12, 2021 at 11:33:32PM -0400, Bruce Momjian wrote:
> On Thu, May 13, 2021 at 11:16:13AM +0800, Julien Rouhaud wrote:
> > On Wed, May 12, 2021 at 11:06:52PM -0400, Bruce Momjian wrote:
> > > On Thu, May 13, 2021 at 09:57:00AM +0800, Julien Rouhaud wrote:
> > >
> > > > source? What if you have for instance pg_stat_statements, pg_stat_kcache,
> > > > pg_store_plans and pg_wait_sampling installed? All those extensions need a
> > > > query_id (or at least benefit from it for pg_wait_sampling), is there any value
> > > > to give a full list of all the modules that would enable compute_query_id?
> > >
> > > Well, we don't have any other cases where the source of the change is
> > > this indeterminate, so I don't really have a suggestion. I think this
> > > does highlight another case where 'auto' just isn't a good fit for our
> > > API.
> >
> > It may depends on your next question
> >
> > > > I'm assuming that anyone wanting to install any of those extensions (or any
> > > > similar one) is fully aware that they aggregate metrics based on at least a
> > > > query_id. If they don't, well they probably never read any documentation since
> > > > postgres 9.2 which introduced query normalization, and I doubt that they will
> > > > be interested in having access to the information anyway.
> > >
> > > My point is that we are changing a setting from an extension, and if you
> > > look in pg_setting, it will say "default"?
> >
> > No, it will say "on". What the patch I sent implements is:
>
> I was asking what pg_settings.source will say:
>
> SELECT name, soource FROM pg_settings;

Oh sorry. Here's the full output before and after a dynamic call to
queryIsWanted():

name | compute_query_id
setting | auto
unit | <NULL>
category | Statistics / Monitoring
short_desc | Compute query identifiers.
extra_desc | <NULL>
context | superuser
vartype | enum
source | default
min_val | <NULL>
max_val | <NULL>
enumvals | {auto,on,off}
boot_val | auto
reset_val | auto
sourcefile | <NULL>
sourceline | <NULL>
pending_restart | f

=# LOAD 'pg_qualstats'; /* will call queryIsWanted() */
WARNING: 01000: Without shared_preload_libraries, only current backend stats will be available.
LOAD

name | compute_query_id
setting | on
unit | <NULL>
category | Statistics / Monitoring
short_desc | Compute query identifiers.
extra_desc | <NULL>
context | superuser
vartype | enum
source | default
min_val | <NULL>
max_val | <NULL>
enumvals | {auto,on,off}
boot_val | auto
reset_val | auto
sourcefile | <NULL>
sourceline | <NULL>
pending_restart | f

So yes, it says "default" (and it's the same if the change is done during
postmaster init). It can be fixed if that's the only issue.

>
> > - compute_query_id = on means it was either explicitly set to on, or
> > automatically set to on if it was allowed to (so initially set to auto). It
> > means you know whether core query_id calculation is enabled or not, you can
> > know looking at the reset value it it was changed by an extension, you just
> > don't know which one(s).
> >
> > - compute_query_id = auto means that it can be set to on, it just wasn't yet,
> > so it's off, and may change if you have an extension that can be dynamically
> > loaded and request for core query_id calculation to be enabled
>
> So, it is 'off', but not set to 'off' in the GUC sense --- just off as
> in not being computed. You can see the confusion in my just reading
> that sentence.

It's technically not "off" but "not on yet", but that's probably just making it
worse :)

> How do they know to set shared_preload_libraries then? We change the
> user API all the time, especially in GUCs, and even rename them, but for
> some reason we don't think people using pg_stat_statements can be
> trusted to read the release notes and change their behavior. I just
> don't get it.

I don't know what to say. So here is a summary of the complaints that I'm
aware of:

- https://www.postgresql.org/message-id/1953aec168224b95b0c962a622bef0794da6ff40.camel@moonset.ru
That was only a couple of days after the commit just before the feature freeze,
so it may be the less relevant complaint.

- https://www.postgresql.org/message-id/CAOxo6XJEYunL71g0yD-zRzNRRqBG0Ssw-ARygy5pGRdSjK5YLQ%40mail.gmail.com
Did a git bisect to find the commit that changed the behavior and somehow
didn't notice the new setting

- this thread, with Fuji-san saying:

> I'm afraid that users may easily forget to enable compute_query_id when using
> pg_stat_statements (because this setting was not necessary in v13 or before)

- this thread, with Peter E. saying:

> Now there is the additional burden of turning on this weird setting that
> no one understands. That's a 50% increase in burden. And almost no one will
> want to use a nondefault setting. pg_stat_statements is pretty popular. I
> think leaving in this requirement will lead to widespread confusion and
> complaints.

- this thread, with Pavel saying:

> Until now, the pg_stat_statements was zero-config. So the change is not user
> friendly.

So it's a mix of "it's changing something that didn't change in a long time"
and "it's adding extra footgun and/or burden as it's not doing by default what
the majority of users will want", with an overwhelming majority of people
supporting the "we don't want that extra burden".

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2021-05-13 04:06:35 Re: alter subscription drop publication fixes
Previous Message Bruce Momjian 2021-05-13 03:33:32 Re: compute_query_id and pg_stat_statements