Quick Links

Re: RFC: Logging plan of the running query

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc:	Atsushi Torikoshi <torikoshia(dot)tech(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, samimseih(at)gmail(dot)com, destrex271(at)gmail(dot)com
Subject:	Re: RFC: Logging plan of the running query
Date:	2025-09-16 15:05:57
Message-ID:	CA+TgmoYmWZ-PjEkhu6BJj=BGFtvKdfEE3up9e4nvfCLupE9h3A@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, Sep 16, 2025 at 9:30 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
> I'm not sure if this is the best approach, but I changed them to log
> nothing in these cases.
> In addition to the reason you mentioned, I found that when
> pg_log_query_plan() is executed repeatedly at very short intervals
> (e.g., every 0.01 seconds), ereport() could lead to
> ProcessLogQueryPlanInterrupt() being invoked again inside errfinish via
> CFI(), which resulted in a stack overflow.
> While this could be processed with check_stack_depth(), it doesn’t seem
> worthwhile to use it just to emit a message of limited usefulness.

Yes, it seems as though we should set a flag to prevent reentrancy
here -- if we are already in the code path where we log the query
plan, we shouldn't accept an interrupt telling us to log the query
plan. I haven't looked at the updated patch so maybe that's not
necessary for some reason, but in general reentrancy is a concern with
features of this type.

> Additionally, there are other cases where the plan cannot be logged
> (e.g., when the query is in a state which ExecProcNode will not be
> invoked any further).
> So instead, I added the following note to the documentation:
>
> Note that depending on the execution state of the query,
> it may not be possible to log the plan.

This seems to me to be too vague to be useful. I expect readers to
read this to mean "sometimes this feature may not work." But that
seems too pessimistic if the only case in which it doesn't work is
when we are the very tail end of the query and it was just about to
stop running, we probably don't need to document anything, as it will
happen rarely and will look about the same as if the query finished
very slightly earlier and we just missed it. If there are more cases
than that in which this feature won't work, we should talk about
those, and maybe fix them.

> > Also, I think that we don't normally put the PID of the
> > process that is performing an action in the primary error message,
> > because the user can include %p in log_line_prefix if they so desire.
>
> That makes sense, but
> I had used the following message from pg_log_backend_memory_contexts()
> as a reference
> and put the PID in the message:
>
> errmsg("logging memory contexts of PID %d", MyProcPid)));
>
> Since both pg_log_query_plan() and pg_log_backend_memory_contexts()
> output information about another other backend, it feels slightly odd
> that their log messages would be inconsistent.
> Perhaps should we consider changing pg_log_backend_memory_contexts() for
> consistency as well?

Yeah, maybe. I don't know what the reason behind the unusual framing
of that message is, so perhaps there is an argument for being
consistent with it, rather than changing it. However, it looks unusual
to me.

> With this change, while running the query below, pg_log_query_plan() is
> now able to output the plan for SELECT gen_subtxn(g) FROM
> generate_series(1,1000000) g as below:

Excellent!

> This happens when the plan-logging function is called after an ERROR is
> raised but before the transaction state stack is popped.
> So in the attached patch, as in the previous version, I reset QueryDesc
> to NULL during that interval.

I think a better fix would be to dump nothing when the current
(sub)transaction is (sub)aborted. If you don't do that, then you're
hoping that you can manage to set QueryDesc to null quickly enough
after the transaction is no longer in a valid state, which does not
seem like a great way to make things robust.

> > +-- Note that we're using a slightly more complex query here because
> > +-- a simple 'SELECT pg_log_query_plan(pg_backend_pid())' would finish
> > +-- before it reaches the code path that actually outputs the plan.
> > +SELECT result FROM (SELECT pg_log_query_plan(pg_backend_pid()))
> > result;
> > + result
> > +--------
> > + (t)
> > +(1 row)
> >
> > I seriously doubt that this will be stable across the entire
> > buildfarm. You're going to need a different approach here.
>
> Changed it to the following query:
>
> WITH t AS MATERIALIZED (SELECT pg_log_query_plan(pg_backend_pid()))
> SELECT * FROM t;

I don't see why that wouldn't have a race condition?

> Also since when the target query is using CTE, pg_log_query_plan() had
> few chance to log the plan, attached patch added T_CteScan handling.

Shouldn't all node types be handled equally?

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Re: RFC: Logging plan of the running query at 2025-09-16 13:30:31 from torikoshia

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Srirama Kucherlapati	2025-09-16 15:13:30	RE: AIX support
Previous Message	Robert Haas	2025-09-16 14:42:57	Re: Improving the names generated for indexes on expressions