Re: RFC: Logging plan of the running query

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-10-12 13:21:26
Message-ID: 2d6d7846b6b794c0e727c431852a113c@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2023-10-11 16:22, Ashutosh Bapat wrote:

> Like many others I think this feature is useful to debug a long running
> query.
>
> Sorry for jumping late into this.
>
> I have a few of high level comments

Thanks for your comments!

> There is a lot of similarity between what this feature does and what
> auto explain does. I see the code is also duplicated. There is some
> merit in avoiding this duplication
> 1. we will get all the features of auto_explain automatically like
> choosing a format (this was expressed somebody earlier in this
> thread), setings etc.
> 2. avoid bugs. E.g your code switches context after ExplainState has
> been allocated. These states may leak depending upon when this
> function gets called.
> 3. Building features on top as James envisions will be easier.
>
> Considering the similarity with auto_explain I wondered whether this
> function should be part of auto_explain contrib module itself? If we
> do that users will need to load auto_explain extension and thus
> install executor hooks when this function doesn't need those. So may
> not be such a good idea. I didn't see any discussion on this.

I once thought about adding this to auto_explain, but I left it asis for
below reasons:

- One of the typical use case of pg_log_query_plan() would be analyzing
slow query on customer environments. On such environments, We cannot
always control what extensions to install.
Of course auto_explain is a major extension and it is quite possible
that they installed auto_explain, but but it is also possible they do
not.
- It seems a bit counter-intuitive that pg_log_query_plan() is in an
extension called auto_explain, since it `manually`` logs plans

> I tried following query to pass PID of a non-client backend to this
> function.
> #select pg_log_query_plan(pid), application_name, backend_type from
> pg_stat_activity where backend_type = 'autovacuum launcher';
> pg_log_query_plan | application_name | backend_type
> -------------------+------------------+---------------------
> t | | autovacuum launcher
> (1 row)
> I see "LOG: backend with PID 2733631 is not running a query or a
> subtransaction is aborted" in server logs. That's ok. But may be we
> should not send signal to these kinds of backends at all, thus
> avoiding some system calls.

Agreed, it seems better.
Attached patch checks if the backendType of target process is 'client
backend'.

=# select pg_log_query_plan(pid), application_name, backend_type from
pg_stat_activity where backend_type = 'autovacuum launcher';
WARNING: PID 63323 is not a PostgreSQL client backend process
pg_log_query_plan | application_name | backend_type
-------------------+------------------+---------------------
f | | autovacuum launcher

> I am also wondering whether it's better to report the WARNING as
> status column in the output. E.g. instead of
> #select pg_log_query_plan(100);
> WARNING: PID 100 is not a PostgreSQL backend process
> pg_log_query_plan
> -------------------
> f
> (1 row)
> we output
> #select pg_log_query_plan(100);
> pg_log_query_plan | status
> -------------------+---------------------------------------------
> f | PID 100 is not a PostgreSQL backend process
> (1 row)
>
> That looks neater and can easily be handled by scripts, applications
> and such. But it will be inconsistent with other functions like
> pg_terminate_backend() and pg_log_backend_memory_contexts().

It seems neater, but it might be inconvenient because we can no longer
use it in select list like the following query as you wrote:

#select pg_log_query_plan(pid), application_name, backend_type from
pg_stat_activity where backend_type = 'autovacuum launcher';

> I do share a concern that was discussed earlier. If a query is running
> longer, there's something problematic with it. A diagnostic
> intervention breaking it further would be unwelcome. James has run
> experiments to shake this code for any loose breakages. He has not
> found any. So may be we are good. And we wouldn't know about very rare
> corner cases so easily without using it in the field. So fine with it.
> If we could add some safety net that will be great but may not be
> necessary for the first cut.

If there are candidates for the safety net, I'm willing to add them.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

Attachment Content-Type Size
v31-0001-Add-function-to-log-the-plan-of-the-query.patch text/x-diff 27.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2023-10-12 13:36:19 Re: Lowering the default wal_blocksize to 4K
Previous Message Dilip Kumar 2023-10-12 12:46:05 Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock