Re: RFC: Logging plan of the running query

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-01-27 11:18:37
Message-ID: d3313477-2d51-ea2d-b08c-7d62ee3e12b9@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2022/01/24 14:33, torikoshia wrote:
> As mentioned above, I updated the patch.

Thanks for updating the patch!

Here are another review comments:

+LOG: plan of the query running on backend with PID 17793 is:

This seems not the same as what actually logged.

+ ereport(WARNING,
+ (errmsg("PID %d is not a PostgreSQL server process", pid)));

Like commit 7fa945b857 changed, this warning message should be "PID %d is not a PostgreSQL backend process"?

+ if (SendProcSignal(pid, PROCSIG_LOG_CURRENT_PLAN, InvalidBackendId) < 0)

proc->backendId should be specified instead of InvalidBackendId, to speed up the processing in SendProcSignal()?

+ PROCSIG_LOG_CURRENT_PLAN, /* ask backend to log plan of the current query */
+volatile sig_atomic_t LogCurrentPlanPending = false;
+extern void HandleLogCurrentPlanInterrupt(void);
+extern void ProcessLogCurrentPlanInterrupt(void);

Isn't it better to use the names that are more consistent with the function name, i.e., pg_log_query_plan? For example, PROCSIG_LOG_QUERY_PLAN instead of PROCSIG_LOG_CURRENT_PLAN?

+ ereport(LOG_SERVER_ONLY,
+ errmsg("backend with PID %d is not running a query",
+ MyProcPid));

errhidestmt(true) and errhidecontext(true) need to be added, don't they? Otherwise, for example, if pg_log_query_plan() is executed after debug_query_string is set but before ActiveQueryDesc is set, STATEMENT message would be output even though the message saying "not running a query" is output. Which seems confusing.

+ hash_seq_init(&status, GetLockMethodLocalHash());
+ while ((locallock = (LOCALLOCK *) hash_seq_search(&status)) != NULL)
+ {
+ if (LOCALLOCK_LOCKTAG(*locallock) == LOCKTAG_PAGE)

Why did you use the search for local lock hash instead of IsPageLockHeld flag variable, to check whether a page lock is held or not? Because there is the corner case where the interrupt is processed after the local lock is registered into the hash but before IsPageLockHeld is enabled?

There is the case where the request to log a query plan is skipped even while the target backend is running a query. If this happens, users can just retry pg_log_query_plan(). These things should be documented?

+ ereport(LOG_SERVER_ONLY,
+ errmsg("backend with PID %d is holding a page lock. Try again",
+ MyProcPid));

It seems more proper to output this message in DETAIL or HINT, instead. So how about something like the following messages?

LOG: could not log the query plan
DETAIL: query plan cannot be logged while page level lock is being held
HINT: Try pg_log_query_plan() after a few ....

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2022-01-27 11:34:03 psql: Rename results to result when only a single one is meant
Previous Message Julien Rouhaud 2022-01-27 11:09:29 Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?