Re: RFC: Logging plan of the running query

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Алена Рыбакина <lena(dot)ribackina(at)yandex(dot)ru>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-09-20 14:41:57
Message-ID: 38f4e19323b264e5975c60010b512b15@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2022-09-19 17:47, Алена Рыбакина wrote:
Thanks for your review and comments!

> Hi,
>
> I'm sorry,if this message is duplicated previous this one, but I'm not
> sure that the previous message is sent correctly. I sent it from email
> address a(dot)rybakina(at)postgrespro(dot)ru and I couldn't send this one email
> from those address.

I've successfully received your mail from both a(dot)rybakina(at)postgrespro(dot)ru
and lena(dot)ribackina(at)yandex(dot)ru(dot)

> I like idea to create patch for logging query plan. After reviewing
> this code and notice some moments and I'd rather ask you some
> questions.
>
> Firstly, I suggest some editing in the comment of commit. I think, it
> is
> turned out the more laconic and the same clear. I wrote it below since
> I
> can't think of any other way to add it.

> ```
> Currently, we have to wait for finishing of the query execution to
> check
> its plan.
> This is not so convenient in investigation long-running queries on
> production
> environments where we cannot use debuggers.

> To improve this situation there is proposed the patch containing the
> pg_log_query_plan()
> function which request to log plan of the specified backend process.

> By default, only superusers are allowed to request log of the plan
> otherwise
> allowing any users to issue this request could create cause lots of log
> messages
> and it can lead to denial of service.
>
> At the next requesting CHECK_FOR_INTERRUPTS(), the target backend logs
> its plan at
> LOG_SERVER_ONLY level and therefore this plan will appear in the server
> log only,
> not to be sent to the client.
Thanks, I have incorporated your comments.
Since the latter part of the original message comes from the commit
message of pg_log_backend_memory_contexts(43620e328617c), so I left it
as it was for consistency.

> Secondly, I have question about deleting USE_ASSERT_CHECKING in lock.h?
> It supposed to have been checked in another placed of the code by
> matching values. I am worry about skipping errors due to untesting with
> assert option in the places where it (GetLockMethodLocalHash)
> participates and we won't able to get core file in segfault cases. I
> might not understand something, then can you please explain to me?
Since GetLockMethodLocalHash() is only used for assertions, this is only
defined when USE_ASSERT_CHECKING is enabled.
This patch uses GetLockMethodLocalHash() not only for the assertion
purpose, so I removed "ifdef USE_ASSERT_CHECKING" for this function.
I belive it does not lead to skip errors.

> Thirdly, I have incomprehension of the point why save_ActiveQueryDesc
> is
> declared in the pquery.h? I am seemed to save_ActiveQueryDesc to be
> used
> in an once time in the ExecutorRun function and its declaration
> superfluous. I added it in the attached patch.
Exactly.

> Fourthly, it seems to me there are not enough explanatory comments in
> the code. I also added them in the attached patch.
Thanks!

| + /*
| + * Save value of ActiveQueryDesc before having called
| + * ExecutorRun_hook function due to having reset by
| + * AbortSubTransaction.
| + */
| +
| save_ActiveQueryDesc = ActiveQueryDesc;
| ActiveQueryDesc = queryDesc;
|
| @@ -314,6 +320,7 @@ ExecutorRun(QueryDesc *queryDesc,
| else
| standard_ExecutorRun(queryDesc, direction, count,
execute_once);
|
| + /* We set the actual value of ActiveQueryDesc */
| ActiveQueryDesc = save_ActiveQueryDesc;

Since these processes are needed for nested queries, not only for
AbortSubTransaction[1], added comments on it.

| +/* Function to handle the signal to output the query plan. */
| extern void HandleLogQueryPlanInterrupt(void);

I feel this comment is unnecessary since the explanation of
HandleLogQueryPlanInterrupt() is written in explain.c and no functions
in explain.h have comments in it.

> Lastly, I have incomprehension about handling signals since have been
> unused it before. Could another signal disabled calling this signal to
> log query plan? I noticed this signal to be checked the latest in
> procsignal_sigusr1_handler function.
Are you concerned that one signal will not be processed when multiple
signals are sent in succession?
AFAIU both of them are processed since SendProcSignal flags
ps_signalFlags for each signal.

```
SendProcSignal(pid_t pid, ProcSignalReason reason, BackendId backendId)
{
volatile ProcSignalSlot *slot;
...(snip)...
278 if (slot->pss_pid == pid)
279 {
280 /* Atomically set the proper flag */
281 slot->pss_signalFlags[reason] = true;
282 /* Send signal */
283 return kill(pid, SIGUSR1);
```

Comments of ProcSignalReason also say 'We can cope with concurrent
signals for different reasons'.

```C
/*
* Reasons for signaling a Postgres child process (a backend or an
auxiliary
* process, like checkpointer). We can cope with concurrent signals
for different
* reasons. However, if the same reason is signaled multiple times in
quick
* succession, the process is likely to observe only one notification
of it.
* This is okay for the present uses.
...
typedef enum
{
PROCSIG_CATCHUP_INTERRUPT, /* sinval catchup interrupt */
PROCSIG_NOTIFY_INTERRUPT, /* listen/notify interrupt */
PROCSIG_PARALLEL_MESSAGE, /* message from cooperating parallel
backend */
PROCSIG_WALSND_INIT_STOPPING, /* ask walsenders to prepare for
shutdown */
PROCSIG_BARRIER, /* global barrier interrupt */
PROCSIG_LOG_MEMORY_CONTEXT, /* ask backend to log the memory
contexts */
PROCSIG_LOG_QUERY_PLAN, /* ask backend to log plan of the current
query */
...
} ProcSignalReason;
```

[1]
https://www.postgresql.org/message-id/8b53b32f-26cc-0531-4ac0-27310e0bef4b%40oss.nttdata.com

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

Attachment Content-Type Size
v24-0001-log-running-query-plan.patch text/x-diff 25.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Dilger 2022-09-20 14:55:23 Re: why can't a table be part of the same publication as its schema
Previous Message Alvaro Herrera 2022-09-20 14:39:46 Re: cataloguing NOT NULL constraints