Re: [PATCH] Add NESTED_STATEMENTS option to EXPLAIN

From: Mohamed ALi <moali(dot)pg(at)gmail(dot)com>
To: Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PATCH] Add NESTED_STATEMENTS option to EXPLAIN
Date: 2026-05-26 18:32:26
Message-ID: CAGnOmWpqEjK39g6X5cHD3nWa82eAg0dDMJv1uPTOWYCXRJjMQA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Subject: [PATCH v4] Add NESTED_STATEMENTS option to EXPLAIN

Hi,

Attached is v4 of the patch adding NESTED_STATEMENTS as a core EXPLAIN
option. This version adds significant new features on top of v3

Changes since v3:
- Added SHOW_NESTED integer option to limit displayed plans
- Added Nested Statements Summary (total, max depth, timing, slowest)
- Added per-statement execution time percentage (level-1 only, %.1f)
- Added trigger detection via GetMyTriggerDepth() from trigger.c:
both BEFORE and AFTER triggers annotated with "(trigger)",
parent statements show "Trigger X: time=Y calls=Z" lines,
separate Total Trigger Time in summary (% of nested time)
- Total Nested Time and Total Trigger Time sum only level-1 statements
(deeper levels' time is already included in their level-1 parent)
- Works directly on DML with triggers and inside functions
(reveals trigger activity unreachable by built-in reporting)
- Query Identifier shown with VERBOSE (correlates with pg_stat_statements)
- Updated documentation with all new features and examples

== Feature Overview ==

1. SHOW_NESTED Option
---------------------

Controls how many nested plans are displayed. All statements are still
captured internally for the summary, but only the first N plans appear
in the output. Use SHOW_NESTED 0 for summary-only mode.

Note: This is purely a display control, it does not limit execution
or capture depth. PostgreSQL already has max_stack_depth to prevent
runaway recursion at the execution level. SHOW_NESTED addresses a
different problem: managing output size when a function legitimately
executes many statements.

EXPLAIN (ANALYZE, NESTED_STATEMENTS, SHOW_NESTED 2) SELECT my_func();

Nested Plans (showing 2 of 5):

Nested Statement #1 (level 1):
Query Text: SELECT COUNT(*) FROM t WHERE id = 1
...
Execution Time: 0.011 ms (36.0%)

Nested Statement #2 (level 1):
Query Text: SELECT COUNT(*) FROM t WHERE id = 2
...
Execution Time: 0.010 ms (33.0%)

Nested Statements Summary:
Total: 5 statements, max depth 1
Total Execution Time: 0.353 ms
Total Nested Time: 0.030 ms (9.0% of total time)
Slowest Statement: #1 (0.011 ms, 36.0%)

SHOW_NESTED 0 shows only the summary without any individual plans.
Requires NESTED_STATEMENTS to be enabled.

2. Nested Statements Summary
----------------------------

Thanks to Ryan Booz and Lukas Fittl for the discussion at PGCon DEV
2026 which inspired the Nested Statements Summary feature.

A summary section appears after the nested plans showing key metrics:

Nested Statements Summary:
Total: 5 statements (3 direct, 2 trigger-fired), max depth 3
Total Execution Time: 1.300 ms
Total Nested Time: 1.001 ms (77.0% of total time)
Total Trigger Time: 0.125 ms (12.5% of nested time)
Slowest Statement: #2 (0.800 ms, 79.9%)

- Total Execution Time: the main query's wall-clock execution time
(same value as "Execution Time" in the main plan)
- Total Nested Time: sum of level-1 direct (non-trigger) statement
times, with percentage of total execution time
- Total Trigger Time: sum of level-1 trigger-fired statement times,
shown as percentage of nested time (only appears when triggers fire)
- Slowest Statement: identifies the bottleneck immediately
- Controlled by the SUMMARY option (on by default with ANALYZE)

The percentage answers: "What fraction of total execution time is
accounted for by the captured SQL statements?"

Example A — High % (SQL-heavy):
Total Execution Time: 50.000 ms
Total Nested Time: 48.500 ms (97.0% of total time)
→ Almost all time is in SQL. Use per-statement % to find the
slow query (e.g., a missing index on a WHERE clause).

Example B — Low % (overhead-heavy):
Total Execution Time: 10.000 ms
Total Nested Time: 0.500 ms (5.0% of total time)
→ SQL finishes in 0.5ms but the function takes 10ms total.
The gap (9.5ms) is PL/pgSQL interpreter overhead: loops,
conditionals, variable assignments, SPI context switches,
exception block setup. Consider rewriting as a single SQL
statement or reducing loop iterations.

3. Per-Statement Execution Time with Percentage
------------------------------------------------

Each level-1 nested statement shows its execution time and what fraction
of the total nested time it represents. Deeper-level statements show
only absolute time (their time is already included in their level-1
parent):

Nested Statement #1 (level 1):
Query Text: SELECT COUNT(*) FROM products
Aggregate (actual time=0.200..0.201 rows=1.00 loops=1)
-> Seq Scan on products (...)
Execution Time: 0.201 ms (20.1%)

Nested Statement #2 (level 1):
Query Text: UPDATE products SET price = price * 1.10 WHERE ...
Update on products (actual time=0.800..0.800 rows=0.00 loops=1)
-> Seq Scan on products (...)
Execution Time: 0.800 ms (79.9%)

Percentages use one decimal place to avoid misleading rounding
(e.g., 99.8% instead of 100% when there are multiple statements).
Level-1 percentages always sum to ≤100%.

4. Trigger Detection
--------------------

Both BEFORE and AFTER trigger-fired statements are automatically
detected and annotated. Detection uses GetMyTriggerDepth() a new
C-callable getter for the existing MyTriggerDepth variable in trigger.c
(the same variable backing pg_trigger_depth() SQL function).

EXPLAIN (ANALYZE, NESTED_STATEMENTS) SELECT process_order(1);

Nested Statement #1 (level 2, trigger):
Query Text: INSERT INTO audit_log (order_id, action) VALUES (...)
Insert on audit_log (actual time=0.522..0.522 rows=0.00 loops=1)
Execution Time: 0.522 ms

Nested Statement #2 (level 1):
Query Text: INSERT INTO orders VALUES (p_id, 99.99, 'new')
Insert on orders (actual time=0.842..0.842 rows=0.00 loops=1)
-> Result (actual time=0.001..0.001 rows=1.00 loops=1)
Trigger order_audit_trigger: time=0.522 calls=1
Execution Time: 1.520 ms (96.2%)

Nested Statements Summary:
Total: 4 statements (2 direct, 2 trigger-fired), max depth 2
Total Execution Time: 1.784 ms
Total Nested Time: 1.580 ms (88.6% of total time)
Total Trigger Time: 0.527 ms (33.3% of nested time)
Slowest Statement: #2 (1.520 ms, 96.2%)

Key design decisions:
- Detection uses GetMyTriggerDepth() > 0, which catches both BEFORE
and AFTER triggers (MyTriggerDepth is incremented in ExecCallTriggerFunc
before the trigger function runs, regardless of trigger timing)
- pg_trigger_depth() SQL function refactored to use the same getter
(zero behavior change, just shares the accessor)
- Trigger statements show no percentage (their time is already included
in the parent statement's time, they're not additive)
- Total Trigger Time is shown as % of nested time, telling users how
much of their SQL time is actually trigger overhead
- Summary shows "(N direct, M trigger-fired)" breakdown
- When no triggers are involved, trigger-related output is suppressed
- Nested statements that fire triggers show "Trigger X: time=Y calls=Z"
lines in their plan output (matching top-level EXPLAIN behavior)
- This also reveals trigger activity inside functions, which PostgreSQL's
built-in "Trigger X: time=Y calls=Z" reporting cannot reach (it only
reports triggers on the top-level statement)

5. Direct DML Trigger Visibility
---------------------------------

NESTED_STATEMENTS works directly on DML statements that fire triggers,
without needing to wrap anything in a function:

EXPLAIN (ANALYZE, NESTED_STATEMENTS)
UPDATE emp SET salary = salary * 1.10 WHERE salary < 65000;

Nested Statement #1 (level 2, trigger):
Query Text: INSERT INTO notifications VALUES (...)
...
Execution Time: 0.415 ms

Nested Statement #2 (level 1, trigger):
Query Text: INSERT INTO emp_audit VALUES (OLD.id, OLD.salary, NEW.salary)
...
Execution Time: 1.221 ms

Nested Statements Summary:
Total: 4 statements (0 direct, 4 trigger-fired), max depth 2
Total Execution Time: 5.440 ms
Total Trigger Time: 1.692 ms
Slowest Statement: #2 (1.221 ms)

This reveals cascading trigger chains: the UPDATE fires an audit
trigger (level 1), which inserts into emp_audit, which fires a
notification trigger (level 2). Users can immediately see why their
simple UPDATE is slow.

6. Query Identifier Integration (pg_stat_statements)
-----------------------------------------------------

When VERBOSE is enabled and compute_query_id is active, each nested
statement shows its Query Identifier. Repeated queries (same SQL
with different parameters) share the same identifier, allowing users
to correlate nested plans with pg_stat_statements entries:

SET compute_query_id = on;
EXPLAIN (ANALYZE, NESTED_STATEMENTS, VERBOSE, COSTS OFF)
SELECT process_orders();

Nested Statement #1 (level 1):
Query Text: SELECT COUNT(*) FROM orders
Aggregate (...)
Query Identifier: 213668903401070912
Execution Time: 0.012 ms (0.2%)

Nested Statement #3 (level 2):
Query Text: UPDATE orders SET total = total * (1 - p_pct/100)
WHERE id = p_id
Update on orders (...)
Query Identifier: -4236462531438235649
Trigger audit_trig: time=1.110 calls=1
Execution Time: 1.172 ms (15.2%)

Users can then query: SELECT * FROM pg_stat_statements
WHERE queryid = -4236462531438235649;

This requires no additional code — it works because ExplainPrintPlan
already shows Query Identifier when VERBOSE is enabled, and SPI
computes queryIds for nested statements when compute_query_id is on.

== Implementation Notes ==

- Trigger detection uses GetMyTriggerDepth() a new getter function
added to trigger.c that exposes the existing MyTriggerDepth variable.
pg_trigger_depth() SQL function refactored to use the same getter.
Detects both BEFORE and AFTER triggers with a single check.
- Each nested statement's plan includes ExplainPrintTriggers() output,
showing which triggers it fired (matches top-level EXPLAIN behavior)
- Total Nested Time only sums level-1 direct (non-trigger) statements;
deeper levels show absolute time but are not added to the total
(their time is already included in their level-1 parent's time)
- Total Trigger Time similarly sums only level-1 trigger statements
to avoid overcounting with cascading triggers
- SHOW_NESTED controls display only; all statements are still captured
for accurate summary statistics regardless of the display limit
- The show_nested field defaults to -1 (show all); 0 means summary only
- Structured formats (JSON/XML/YAML) include Execution Time as a typed
numeric field inside each nested plan's structure
- This feature reveals trigger activity inside functions that PostgreSQL's
built-in trigger reporting ("Trigger X: time=Y calls=Z") cannot reach

== Test Suite ==

The patch includes a comprehensive test script with 41 test cases
covering all features, edge cases, and error handling. The full
output is attached so reviewers can see actual behavior without
needing to apply the patch, and to help identify any test cases
I may have missed:

comprehensive_nested_statements_test_v4.sql (test script)
test_output_all_v4.txt (full output)

Test cases:
1. Validation (NESTED_STATEMENTS requires ANALYZE)
2. Simple PL/pgSQL function (all level 1)
3. PERFORM pattern (creates deeper nesting levels)
4. Expression assignment (stays at same level)
5. Side-by-side comparison of both patterns
6. SQL function nesting (true SQL nesting)
7. Three-level chain with PERFORM
8. Recursive function (increasing levels)
9. Exception handling blocks
10. No nested statements (plain query)
11. Trigger-fired nested statements (with detection)
12. Combined with VERBOSE and BUFFERS
13. Statement numbering = completion order (triggers demo)
14. BEGIN/ROLLBACK safety pattern
15. Error during EXPLAIN does not crash (Bug 1 fix)
16. Nested EXPLAIN does not crash (Bug 2 fix)
17. Memory context cleanup (Bug 3 fix)
18. Memory context does not grow across repeated calls
19. Stress test - 50 nested statements
20. Execution Time per nested statement (SUMMARY default)
21. SUMMARY OFF hides timing, percentages, and summary section
22. Structured output - JSON format
23. Structured output - XML format
24. Structured output - YAML format
25. SHOW_NESTED - limit displayed plans
26. SHOW_NESTED 0 - summary only
27. SHOW_NESTED validation (requires NESTED_STATEMENTS)
28. Nested Statements Summary with percentages
29. Statement timeout - hooks cleaned up
30. Dynamic SQL - query text captured correctly
31. Infinite recursion - graceful error (max_stack_depth protects
execution; hooks cleaned up via PG_FINALLY, server continues)
32. auto_explain compatibility
33. Deep nesting (20 levels)
34. All EXPLAIN options combined with SHOW_NESTED
35. Performance - zero overhead when disabled
36. Interpreting Total Nested Time percentage
37. Multiple triggers on separate tables
38. Direct DML with triggers (BEFORE + AFTER, cascading, rejection)
39. BEFORE and AFTER triggers in function context
40. Query Identifier integration (pg_stat_statements)
41. Level-1 only counting (no double-counting with deep nesting)

== Open Questions ==

1. Should the per-statement percentage be available in JSON format as
a separate "Execution Time Percentage" field? Currently it's only
shown in TEXT format and the summary.

2. Should there be a MAX_DEPTH option to limit capture by nesting
level (complementing SHOW_NESTED which limits by count)?

Mohamed Ali
AWS RDS

Attachment Content-Type Size
test_output_all_v4.txt text/plain 206.8 KB
v4-0001-Add-NESTED_STATEMENTS-option-to-EXPLAIN.patch application/octet-stream 40.5 KB
comprehensive_nested_statements_test_v4.sql application/octet-stream 72.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Adam Brusselback 2026-05-26 18:53:28 Re: [Patch] Add WHERE clause support to REFRESH MATERIALIZED VIEW
Previous Message Tomas Vondra 2026-05-26 18:12:35 Re: Changing the state of data checksums in a running cluster