Re: explain analyze rows=%.0f

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, vignesh C <vignesh21(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Simon Riggs <simon(at)2ndquadrant(dot)com>, Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, Euler Taveira de Oliveira <euler(at)timbira(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: explain analyze rows=%.0f
Date: 2022-07-07 22:20:15
Message-ID: 20220707222015.GD13040@telsasoft.com
Lists: pgsql-hackers

On Thu, Jul 07, 2022 at 04:21:37PM -0400, Robert Haas wrote:
> I mean, what I really want here if I'm honest is to not have the
> system divide the number of rows by the loop count. And it sort of
> sounds like maybe that's what you want, too. You want to know whether
> the loop count is actually zero, not whether it's close to zero when
> you divide it by some number that might be gigantic.
...
> involves a dozen or two different nested loops, and if we didn't
> insist on dividing the time by the loop count, it would be MUCH EASIER
> to figure out whether the time spent in the Index Scan is a
> significant percentage of the total time or not.

I think the guiding principle should be to reduce how much has to be
explained about how to interpret what explain is showing...

The docs say this:
| In such cases, the loops value reports the total number of executions of the
| node, and the actual time and rows values shown are averages per-execution.
| This is done to make the numbers comparable with the way that the cost
| estimates are shown. Multiply by the loops value to get the total time
| actually spent in the node.
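
To make the arithmetic concrete, here is a rough sketch (not the actual
explain code) of what that per-execution averaging does to small row counts.
It assumes the displayed value is the total divided by loops and printed with
%.0f, as in the subject line; the function names are made up for illustration.

```python
def shown_rows(total_rows: int, loops: int) -> str:
    """Per-loop average as displayed: total / loops, formatted with %.0f.

    Assumption: this mirrors the "rows=%.0f" display discussed in the thread.
    """
    return "%.0f" % (total_rows / loops)


def total_from_shown(shown: str, loops: int) -> float:
    """What a reader gets by following the docs: displayed value * loops."""
    return float(shown) * loops


# 3 rows returned in total across 1000 executions of the node.
displayed = shown_rows(3, 1000)
print("displayed rows per loop:", displayed)
print("total recovered by multiplying:", total_from_shown(displayed, 1000))

# 2000 rows across 1000 executions divides evenly, so nothing is lost.
print("displayed rows per loop:", shown_rows(2000, 1000))
```

In the first case the average rounds to 0, so multiplying by loops can no
longer recover the 3 rows that were actually returned.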

On Thu, Jul 07, 2022 at 01:45:19PM -0700, Peter Geoghegan wrote:
> Plus you could probably
> make some kind of concession in the direction of maintaining
> compatibility with the current approach if you had to. Right?

The minimum would be to show the information in a way that makes it clear that
it's "new style" output showing a total and not an average, so that a person
who sees it knows how to interpret it (the same goes for the web "explain
tools").

A concession would be to show the current information *plus* total/raw values.
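
As a purely hypothetical sketch of that concession (the "total_rows" label and
the layout are invented here, not a proposed or existing explain format), the
per-loop average could stay where it is and the raw total be appended under a
distinct name:

```python
def format_actual(total_rows: int, loops: int) -> str:
    """Show the existing per-loop average plus the raw total.

    Assumption: "total_rows" is a hypothetical label, chosen only so the new
    value cannot be mistaken for the old per-loop average.
    """
    avg = total_rows / loops
    return "actual rows=%.0f loops=%d total_rows=%d" % (avg, loops, total_rows)


print(format_actual(3, 1000))
print(format_actual(2000, 1000))
```

A separate label means existing tools that parse "rows=" keep seeing the
average, while a person reading the line gets the total without multiplying.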

This thread is about how to display the existing values. But note that there's
a CF entry for also collecting more values to show things like min/max rows per
loop.

https://commitfest.postgresql.org/38/2765/
Add extra statistics to explain for Nested Loop

--
Justin
