| From: | Lukas Fittl <lukas(at)fittl(dot)com> |
|---|---|
| To: | torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> |
| Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org, Atsushi Torikoshi <torikoshia(dot)tech(at)gmail(dot)com>, samimseih(at)gmail(dot)com, destrex271(at)gmail(dot)com, robertmhaas(at)gmail(dot)com |
| Subject: | Re: RFC: Logging plan of the running query |
| Date: | 2026-06-08 06:08:13 |
| Message-ID: | CAP53Pkw8CX0RQY474McHyc1N8p331-asEBAEGu0X+bJX-VB8jg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
On Mon, Mar 16, 2026 at 8:14 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> Rebased the patch.
I was reminded about this patch at PG DATA in Chicago last week, and
figured I'd see if I can contribute a review / help move this forward
in PG20.
First of all, find attached a v54 that is rebased over changes late in
the PG19 cycle, with other minimal fixes:
- Add a missing call to explain_per_plan_hook in ExplainStringAssemble
(this hook was newly added after your last patch version)
- The TAP test number (11) was overlapping with an existing test
(011_lock_stats) that was recently added, renumbered to
014_pg_log_query_plan.pl
- Add missing reference for the TAP test in Meson builds
- Adjust auto_explain.c to not include "dynamic_explain.h" (its not
needed, ExplainStringAssemble is defined in explain.h which is already
included)
- pgindent run to fix minor whitespace issues in dynamic_explain.c
I've also read through the other thread re: progressive explain, and
it seems like this infrastructure here to log the current plan
(without ANALYZE) would be required in any case (and I think the
overall approach makes sense), and then separately we can solve the
progressive explain's parallel worker and other design issues around
execution statistics.
In terms of code review:
> diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
> index aafc53e0164..09e51b3d75a 100644
> --- a/src/backend/access/transam/xact.c
> +++ b/src/backend/access/transam/xact.c
> @@ -37,6 +37,7 @@
> #include "catalog/pg_enum.h"
> #include "catalog/storage.h"
> #include "commands/async.h"
> +#include "commands/dynamic_explain.h"
> #include "commands/tablecmds.h"
> #include "commands/trigger.h"
> #include "common/pg_prng.h"
> @@ -217,6 +218,7 @@ typedef struct TransactionStateData
> bool parallelChildXact; /* is any parent transaction parallel? */
> bool chain; /* start a new block after this one */
> bool topXidLogged; /* for a subxact: is top-level XID logged? */
> + QueryDesc *queryDesc; /* my current QueryDesc */
> struct TransactionStateData *parent; /* back link to parent */
> } TransactionStateData;
Somehow putting the current QueryDesc into TransactionStateData seemed
a bit odd to me, and I briefly experimented with putting this into the
active portal instead - but that doesn't work with subtransactions
that don't have their own portal, which presumably is how you landed
on this design. So I think this makes sense as-is, given those
challenges.
> @@ -2953,6 +2977,9 @@ AbortTransaction(void)
> /* Reset snapshot export state. */
> SnapBuildResetExportedSnapshotState();
>
> + /* Reset current query plan state used for logging. */
> + SetCurrentQueryDesc(NULL);
> +
> /*
> * If this xact has started any unfinished parallel operation, clean up
> * its workers and exit parallel mode. Don't warn about leaked resources.
> @@ -5352,6 +5379,13 @@ AbortSubTransaction(void)
> /* Reset logical streaming state. */
> ResetLogicalStreamingState();
>
> + /*
> + * Reset current query plan state used for logging. Note that even after
> + * this reset, it's still possible to obtain the parent transaction's
> + * query plans, since they are preserved in standard_ExecutorRun().
> + */
> + SetCurrentQueryDesc(NULL);
> +
> /*
> * No need for SnapBuildResetExportedSnapshotState() here, snapshot
> * exports are not supported in subtransactions.
It seems better to me if we avoided overly specific code comments in
xact.c that talk about plan logging behavior, i.e. look at this more
as a general purpose "what's the current queryDesc for the current
transaction" facility.
> diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c
> index 7e40b852517..96cb22daf80 100644
> --- a/src/backend/executor/execProcnode.c
> +++ b/src/backend/executor/execProcnode.c
> @@ -441,27 +443,43 @@ ExecSetExecProcNode(PlanState *node, ExecProcNodeMtd function)
> ...
> static TupleTableSlot *
> ExecProcNodeFirst(PlanState *node)
> {
> ...
> + if (LogQueryPlanPending)
> + LogQueryPlan();
> +
> /*
> * If instrumentation is required, change the wrapper to one that just
> * does instrumentation. Otherwise we can dispense with all wrappers and
> * have ExecProcNode() directly call the relevant function from now on.
> */
> - if (node->instrument)
> + else if (node->instrument)
> node->ExecProcNode = ExecProcNodeInstr;
> else
> node->ExecProcNode = node->ExecProcNodeReal;
I'm not sure if this was discussed earlier already, but I find the
flow here a bit hard to follow in regards to how we execute the actual
node function again.
Specifically, since this uses "else if" we don't touch ExecProcNode
when LogQueryPlanPending = true, and since node->ExecProcNode still
points to ExecProcNodeFirst, the function basically calls itself. On
the second go around we assume that LogQueryPlanPending is no longer
set, so we get back to the intended ExecProcNode *if*
LogQueryPlanPending was set to false (which it is in LogQueryPlan, but
that's not easy to see from just that function).
Why not do it like this:
if (LogQueryPlanPending)
LogQueryPlan();
/* Unmodified existing code */
if (node->instrument)
node->ExecProcNode = ExecProcNodeInstr;
else
node->ExecProcNode = node->ExecProcNodeReal;
(it seems like the patch suggestion that Robert had earlier did this,
but not clear why it evolved from that?)
Thanks,
Lukas
--
Lukas Fittl
| Attachment | Content-Type | Size |
|---|---|---|
| v54-0001-Add-function-to-log-the-plan-of-the-currently-ru.patch | application/octet-stream | 38.7 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Ashutosh Sharma | 2026-06-08 06:08:48 | Re: synchronized_standby_slots behavior inconsistent with quorum-based synchronous replication |
| Previous Message | Amit Kapila | 2026-06-08 06:07:30 | Re: Use correct type for catalog_xmin |