Re: [PATCH] Add NESTED_STATEMENTS option to EXPLAIN

From: Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PATCH] Add NESTED_STATEMENTS option to EXPLAIN
Date: 2026-05-28 22:08:37
Message-ID: CAN4CZFN++9J0274Pj_cEit-uDx3AFMV8oNJ0Q172jM__dWk8RA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

+<programlisting>
+EXPLAIN (ANALYZE, NESTED_STATEMENTS, FORMAT JSON) SELECT my_function();

and

+ Each nested plan is a valid JSON document that can be parsed independently.
+ The metadata headers (statement number, nesting level, query text) and the
+ summary appear as plain text between the JSON blocks. The
+ <literal>Execution Time</literal> field is included inside each nested
+ plan's JSON structure when <literal>SUMMARY</literal> is enabled.
+ Percentages are shown only in the text-format summary, not inside the
+ structured plan objects.0

Are you sure this is correct? My expectation would be for FORMAT JSON
to output a valid, usable JSON document. E.g.:

bin/psql -X -t -A -d db -c "EXPLAIN (FORMAT JSON) SELECT ..." > file.json

writes a valid json file which I can then parse with any json parser.

If I do the same with NESTED_STATEMENTS, I get an invalid json. It's
not only a concatenation of multiple json documents, but also includes
additional completely non-json text in it. It seems difficult to
parse/use.

Same goes for XML/YAML.

+ if (!is_trigger && nested_exec_level == 1 &&
+ stmt_time > nested_slowest_time)
+ {
+ nested_slowest_time = stmt_time;
+ nested_slowest_num = nested_total_count;
+ }

Why is the slowest statement check limited to nesting level 1?

For example, see the following test case where it doesn't display any
slowest statement:

CREATE TABLE big(n int);
INSERT INTO big SELECT g FROM generate_series(1, 200000) g;
CREATE TABLE t_a(id int primary key);
CREATE TABLE t_log(id int);
INSERT INTO t_a VALUES (1);
CREATE OR REPLACE FUNCTION slow_trig() RETURNS trigger LANGUAGE plpgsql AS $$
DECLARE r bigint;
BEGIN
SELECT count(*) FROM big a, generate_series(1, 40) b INTO r;
INSERT INTO t_log VALUES (NEW.id);
RETURN NEW;
END;
$$;
CREATE TRIGGER slow_trigger AFTER UPDATE ON t_a
FOR EACH ROW EXECUTE FUNCTION slow_trig();
EXPLAIN (ANALYZE, NESTED_STATEMENTS, COSTS OFF, TIMING ON)
UPDATE t_a SET id = id WHERE id = 1;

Trigger time calculation also doesn't work for multiple nesting levels, see:

CREATE TABLE t(i int);
CREATE FUNCTION trg() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN PERFORM pg_sleep(0.05); RETURN NEW; END $$;
CREATE TRIGGER trg BEFORE INSERT ON t FOR EACH ROW EXECUTE FUNCTION trg();
CREATE FUNCTION f() RETURNS void LANGUAGE plpgsql AS $$
BEGIN INSERT INTO t VALUES(1); END $$;
EXPLAIN (ANALYZE, NESTED_STATEMENTS, COSTS OFF, TIMING OFF, BUFFERS
OFF) SELECT f();

And staying with the topic of triggers, are you sure that all
statements that are executed within triggers should get a trigger
label? Wouldn't it be better to track only directly trigger-executed
statements?

+ <literal>SUMMARY</literal> is enabled. Level-1 statements (those
+ directly inside the called function) also show a percentage of total
+ nested time. Deeper-level statements show only absolute time because
+ their time is already included in their level-1 parent's reported time.
+ Nested statements are numbered in completion order.

Wouldn't start-time ordering be better rather than completion? Right
now, it includes children before the caller, which could look
confusing. With a multi level difference between statement it is
clearer, but if a level 2 statement follows a level 1 statement, my
first thought that the level 1 statement called the level 2. But it
didn't, the next level 1 did.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2026-05-28 22:11:32 Re: should we have a fast-path planning for OLTP starjoins?
Previous Message Baji Shaik 2026-05-28 22:08:04 Re: [PATCH] Improve REPACK (CONCURRENTLY) error messages for unsupported configurations