From: | Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | maciek(at)sakrejda(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: V18 change on EXPLAIN ANALYZE |
Date: | 2025-09-27 00:31:43 |
Message-ID: | CAOtHd0Ame_kKndkjBzKPpCFoy6x3HoYrf0DeAoufT0ykNuDPEg@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, Sep 26, 2025 at 2:12 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com> writes:
> > The page you link says
>
> > In some query plans, it is possible for a subplan node to be
> > executed more than once. For example, the inner index scan will be
> > executed once per outer row in the above nested-loop plan. In such
> > cases, the loops value reports the total number of executions of the
> > node, and the actual time and rows values shown are averages
> > per-execution. This is done to make the numbers comparable with the
> > way that the cost estimates are shown. Multiply by the loops value to
> > get the total time actually spent in the node. In the above example,
> > we spent a total of 0.030 milliseconds executing the index scans on
> > tenk2.
>
> > in the second paragraph after the example in this section. Do you
> > think that's not sufficiently clear?
>
> It's not wrong, but it feels a little incomplete now. Maybe change
> the last two sentences to
>
> Multiply by the loops value to get the total time actually spent in
> the node and the total number of rows processed by the node across all
> executions. In the above example, we spent a total of 0.030
> milliseconds executing the index scans on tenk2, and they handled a
> total of 10 rows.
>
> A bigger gap in perform.sgml is that it doesn't address parallel
> query cases at all AFAICS. I think that was one of the main drivers
> of this change, so it feels a little sad that it's not covered here.
Fair point. I included your proposed change and took a stab at briefly
covering parallelism in the attached (admittedly, my understanding of
how that works is a little shaky, so apologies if I'm way off on some
of this).
However, to get a parallel query in the regression database (I chose
EXPLAIN ANALYZE SELECT * FROM tenk2), I had to change some settings:

SET min_parallel_table_scan_size = 0;
SET parallel_tuple_cost = 0;
SET parallel_setup_cost = 0;

Should I mention that in the example? Or should I generate a bigger
table so using these is not necessary? If we say nothing and use the
example, I think it may be confusing if someone wants to use the
example as a starting point for their own exploration of how this
works. Or is there a better query that works out of the box and does
not need changes to the settings?
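
For reference, the session described above could be sketched roughly as follows. This assumes a connection to the regression database with tenk2 loaded; the RESET statements are a cleanup step added here and are not part of the original message. Whether a parallel plan is actually chosen still depends on other settings, such as max_parallel_workers_per_gather.

```sql
-- Assumption: connected to the regression database, which contains tenk2.
-- Make a parallel plan attractive even for a small table.
SET min_parallel_table_scan_size = 0;
SET parallel_tuple_cost = 0;
SET parallel_setup_cost = 0;

-- The query chosen in the message above.
EXPLAIN ANALYZE SELECT * FROM tenk2;

-- Cleanup (added for illustration): restore the session defaults.
RESET min_parallel_table_scan_size;
RESET parallel_tuple_cost;
RESET parallel_setup_cost;
```
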
It also seems like the EXPLAIN ANALYZE section is getting a little
unwieldy. Should we subdivide it, or is this still okay?

Thanks,
Maciek
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Improve-EXPLAIN-docs.patch | text/x-patch | 5.1 KB |