BUG #15707: Incorrect timing information in EXPLAIN ANALYZE output

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: david(at)swarm64(dot)com
Subject: BUG #15707: Incorrect timing information in EXPLAIN ANALYZE output
Date: 2019-03-20 16:26:48
Message-ID: 15707-caa4da87391c00ab@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 15707
Logged by: David Geier
Email address: david(at)swarm64(dot)com
PostgreSQL version: 11.2
Operating system: Ubuntu 18.04
Description:

Hi all,

While looking carefully at the EXPLAIN ANALYZE output we came across numbers
for "actual total time" (the second number of the "actual time" range) which
we believe don't add up.

The instrumentation code starts measuring execution time of a plan node
before the corresponding ExecProcNode() function pointer is invoked and
stops after it has completed. This happens recursively on the query plan
while pulling up tuples Volcano-style.

Accordingly, the "actual total time" of any plan node can never be smaller
than the sum of "actual total time"s of all child plan nodes. If you will,
the "actual total time" of a plan node is its inclusive execution time. The
difference with its child plan nodes is its exclusive execution time.

In the following plan you can see that this doesn't hold even for a simple
query like

EXPLAIN ANALYZE SELECT COUNT(*) FROM orders WHERE o_orderkey < 10;

Finalize Aggregate (cost=184906.47..184906.48 rows=1 width=8) (actual
time=283.924..283.925 rows=1 loops=1) <-- actual total time = 283.925ms
-> Gather (cost=184906.05..184906.46 rows=4 width=8) (actual
time=156.571..308.595 rows=4 loops=1) <-- actual total time = 308.595ms
Workers Planned: 4
Workers Launched: 4
-> Partial Aggregate (cost=184406.05..184406.06 rows=1 width=8)
(actual time=181.446..181.446 rows=1 loops=4)
-> Parallel Foreign Scan on orders (cost=500.00..184406.00
rows=19 width=0) (actual time=147.988..181.438 rows=2 loops=4)
Rows in table: 15000000
Rows in snapshot: 15000000
Cluster sets in snapshot: 4458
Rows returned: 7
Planning Time: 2.565 ms
Execution Time: 308.833 ms

Looking through the code we came across ExecShutdownNode(), which uses the
instrumentation functions to include shutdown time into the total execution
time. Though the big difference to the use in ExecProcNode() is that
ExecShutdownNode() only invokes InstrStartNode() AFTER it has already
recursed into its children whereas ExecProcNode() calls it before. This
results in the measured shutdown time for a plan node being exclusive
instead of inclusive; but its still being added to the timing information
obtained from executing ExecProcNode().

We believe a fix to that problem is to move

if (node->instrument && node->instrument->running)
InstrStartNode(node->instrument);

above the recursion

planstate_tree_walker(node, ExecShutdownNode, NULL);

Regards,
David

Browse pgsql-bugs by date

  From Date Subject
Next Message Julian Schauder 2019-03-20 16:48:46 Re: BUG #15703: Segfault in cancelled CALL-Statements
Previous Message Tom Lane 2019-03-20 16:07:47 Re: BUG #15703: Segfault in cancelled CALL-Statements