Quick Links

Expose custom planning data in EXPLAIN

From:	Andrei Lepikhov <lepihov(at)gmail(dot)com>
To:	PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Expose custom planning data in EXPLAIN
Date:	2025-08-13 13:50:56
Message-ID:	e7cb9db0-afb6-4ba5-b86f-307683133387@gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi,

Background and motivation
-------------------------

This feature is inspired by the pg_overexplain, new EXPLAIN-related
hooks and recent change of the Memoize node representation (4bc62b86849).

Based on user reports from the production system, we typically receive
no more than an 'explain verbose analyse' output to identify the issue's
origin. It is obvious to me that the detailisation of the EXPLAIN format
will never be enough. Even more, each new parameter, adding more
information, also complicates life for some people, flooding the screen
with unnecessary (for them) data.

With extensible EXPLAIN options and per-node/summary hooks, we can add
as many details as needed using modules. However, there is one
limitation: we can only explore Plan and PlanState data. If something
isn't transferred from the planning stage to the plan, we won't have a
way to expose this data.
For example, it is sometimes unclear why the optimiser chose
IncrementalSort or [did not choose] HashJoin, as we don't know the
ngroups estimation used at this specific point of the plan.

Design Overview
---------------

It appears that the only two changes required to enable the feature are
a hook and a field in the Plan node. In this patch, I have chosen to add
the hook to the copy_generic_path_info routine to limit its usage for
tracking purposes only. Also, I extended its interface with the
PlannerInfo pointer, which may be helpful in many cases. The new extlist
field in the Plan structure should contain (by convention) extensible
nodes only to let modules correctly pick their data. Also, it simplifies
the logic of the node serialisation.

An additional motivation for choosing Extensible Node is its lack of
core usage, which makes it seem unpolished and requires copying a
significant amount of code to use. This patch highlights this imperfection.

Tests
-----

To demonstrate its real-life application, I added an example to
pg_overexplain. Here, a ngroups value is computed, stored in the Plan
node, and exposed in explain. Also, it is a test for the ExtensionNode
machinery.

Downsides
-----------

1. Growth of the plan node
2. Read/write extensible node - what if reading worker (or backend ?)
doesn't have the module installed?
3. The point for the hook call.

The first issue is quite limited because the only version of the plan
exists, in contrast to the multiple paths.
The second issue is a little more complicated. However, I believe the
issue could be resolved by allowing extensions to determine the logic
for serialising their ExtensibleNode.
The selection of the point for the hook appears to be quite strict. It
does not permit any extensions to alter the final plan or disrupt its
consistency, except for some cost data. However, it does allow for
tracking the decisions made during the planning phase.

See the patch attached.

--
regards, Andrei Lepikhov

Attachment	Content-Type	Size
v0-0001-Introduce-a-create-plan-hook.patch	text/plain	29.6 KB

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Peter Eisentraut	2025-08-13 13:54:20	Re: [PATCH] Proposal: Improvements to PDF stylesheet and table column widths
Previous Message	Bertrand Drouvot	2025-08-13 13:33:13	Re: Improve LWLock tranche name visibility across backends