Re: proposal - queryid can be used as filter for auto_explain

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: zsolt(dot)parragi(at)percona(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: proposal - queryid can be used as filter for auto_explain
Date: 2026-07-01 19:51:30
Message-ID: CAFj8pRBMvvkrCrsSxedhEejhMf=a1fuXuwohu3XVxwi3a+k2kw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

st 1. 7. 2026 v 4:30 odesílatel Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
napsal:

> Hello, I have a few comments on this.
>
> + /*
> + * Inform the postmaster that we want to enable query_id
> calculation if
> + * compute_query_id is set to auto.
> + */
> + EnableQueryId();
>
> This comment is not entirely accurate because when the module is
> loaded using LOAD, the setting is enabled only for the current
> session, not in the postmaster.
>

This comment was taken from pg_stat_statements that cannot be loaded.
I changed the comment to /* If compute_query_id = 'auto', we would like
query IDs. */
This sentence is used by pg_stash_advice.c

>
> Separately from that, I don't think auto_explain should enable query
> ID calculation when it is not actually going to use it. Instead, I
> wonder whether this should be done in assign_log_queryids(). This
> would naturally tie enabling query ID calculation to the state of the
> GUC, including configuration reloads, as well as session-local use via
> LOAD, while avoiding enabling it in sessions that never use
> auto_explain.
>

This is a hard question - I am not sure if one GUC can force a second GUC.
GUC are transactional and stacked, and afraid about cleaning in this case.
I think so any
feature can depend on more GUC, but in this case, the dependency is on
external modules, and then this can be hard to implement without some new
hook.

In this case I prefer simplicity - it is the same way as
pg_stat_statements. It doesn't
need queryid by default, and still it enables queryid by default. I think
it is most
user friendly for users that try to use queryid_filter. At the end, almost
all users
uses pg_stat_statements - so queryid will be enabled, and just for special
cases,
looks to me better to enable queryid explicitly in auto_explain too. It can
be disabled
any time explicitly - and then it is clear, so queryid_filter cannot work.
So I understand to
your comment, but choosed behave looks like most robust - and other ways
like
more fragile

Please, check Jakub Wartak comment - I am afraid to raise log when queryid
is not calculated and queryid_filter is not empty. I am afraid to force
queryid calculation from assigning_queryids_filter. So the implemented
design is the last way. Instead forcing queryid calculation in
queryids_filter assignment is better generally not forcing queryid
calculation.

>
>
> + if (queryId != INT64CONST(0))
> + {
> + int i;
> +
> ...
> + for (i = 0; i < queryId_filter->nqueryids; i++)
> + {
>
> I think it would be better to write this as "for (int i = ...".
>
>
changed

> - if (msec >= auto_explain_log_min_duration)
> + if (log_filter && msec >= auto_explain_log_min_duration)
>
> I think the overall decision logic for deciding whether to log a plan
> could be reorganized to avoid unnecessary query ID lookups.
>

when queryids_filter will be empty, then log_filter will be true.

Anytime, the user can set compute_queryid to OFF. My idea is - it should
work by default, and when the user doesn't want it, then he can explicitly
disable it.

>
>
> + /*
> + * We expect small number of watched queryids, and then
> + * a linear seaching is the fastest. As an alternative
> + * we can sort the array of queryId, and we can search
> + * there by bisection.
> + */
>
> I also think a linear search is fine here. If this ever turns out to
> be insufficient, I expect we'd simply switch to a more appropriate
> algorithm, regardless of whether the alternatives are mentioned here
> or not. Also, "is the fastest" seems stronger than necessary. I think
> we usually write comments like:
>
> > We expect only a few watched query IDs, so a linear search is sufficient.
>
> changed

>
> + if (log_filter && msec >= auto_explain_log_min_duration)
>
> This part made me pause because the relationship between log_filter
> and log_min_duration is not immediately obvious.
>
> Initially, I assumed they were independent options, so I expected an
> OR relationship. However, the implementation uses AND. If users also
> assume these are independent settings, the behavior may be
> surprising. Perhaps it would make sense to introduce a separate
> log_queryid_min_duration parameter if these are intended to be
> independent controls.
>

I thought about this possibility. But I think thinking about log_queryid
and log_min_duration as independent constraints can be confusing too
(and less safe). The log_queryid in this case is stronger filter.

At this moment I don't want to introduce a second new GUC, and without
GUC (like you proposed), the implemented design looks less confusing
and more practical. The possibility of having two (three) variants of
log_min_duration looks strange to me.

I like to think log_queryid is a second constraint, not a new feature.

>
> + /*
> + * queryid is 64bit integer. strtol fails on 32bit or on MSWin.
> + * In other places, queryid passed (from, to SQL) as generic int8,
> + * so we can use int8in routine. int8in can raise an exception,
> + * so pg_strtoint64_safe is used instead.
> + */
>
> I don't think this comment is necessary. Using pg_strtoint64_safe()
> already makes the intent clear.
>

removed

>
> I also think it would be good to add a test for the LOAD case.
>

What do you want to test in this case?

Thank you for check and comments

modified patch attached

Pavel

>
> Regards,
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>

Attachment Content-Type Size
v20260701-4-0001-auto_explain.log_queryids.patch text/x-patch 9.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jacob Champion 2026-07-01 20:14:27 Re: Correct documentation for protocol version
Previous Message Erik Wienhold 2026-07-01 19:36:23 psql: Fix \df tab completion for procedures