Re: [PATCH] Add extra statistics to explain for Nested Loop

From: e(dot)sokolova(at)postgrespro(dot)ru
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PATCH] Add extra statistics to explain for Nested Loop
Date: 2021-03-25 09:52:42
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thank you all for your feedback and reforms.
I attach a new version of the patch with the some changes and fixes.
Here's a list of the major changes:
1) New format of extra statistics. This is now contained in a line
separate from the main statistics.

Julien Rouhaud писал 2021-02-01 08:28:
> On Thu, Jan 28, 2021 at 8:38 PM Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
> wrote:
>> postgres=# explain (analyze, verbose) select * from a,b where a.i=b.j;
>> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>> Nested Loop (cost=0.00..2752.00 rows=991 width=8) (actual
>> time=0.021..17.651 rows=991 loops=1)
>> Output: a.i, b.j
>> Join Filter: (a.i = b.j)
>> Rows Removed by Join Filter: 99009
>> -> Seq Scan on public.b (cost=0.00..2.00 rows=100 width=4)
>> (actual time=0.009..0.023 rows=100 loops=1)
>> Output: b.j
>> -> Seq Scan on public.a (cost=0.00..15.00 rows=1000 width=4)
>> (actual time=0.005..0.091 min_time=0.065 max_time=0.163 min_rows=1000
>> rows=1000 max_rows=1000 loops=100)
>> Output: a.i
>> Planning Time: 0.066 ms
>> Execution Time: 17.719 ms
>> (10 rows)
>> I don't like this format where the extra statistics appear in the same
>> line of existing information because the output format differs
>> depended
>> on whether the plan node's loops > 1 or not. This makes the length of
>> a
>> line too long. Also, other information reported by VERBOSE doesn't
>> change
>> the exiting row format and just add extra rows for new information.
>> Instead, it seems good for me to add extra rows for the new statistics
>> without changint the existing row format as other VERBOSE information,
>> like below.
>> -> Seq Scan on public.a (cost=0.00..15.00 rows=1000 width=4)
>> (actual time=0.005..0.091 rows=1000 loops=100)
>> Output: a.i
>> Loops: min_time=0.065 max_time=0.163 min_rows=1000
>> max_rows=1000
>> and so on. What do you think about it?

2) Correction of the case of parallel scan

>> In parallel scan, the extra statistics are not reported correctly.
>> This reports max/min rows or time of inner scan as 0 in parallel
>> workers,
>> and as a result only the leader process's ones are accounted. To fix
>> this,
>> we would change InstrAggNode as below.

3) Adding extra statistics about total number of rows (total rows).
There were many wishes for this here.

Please don't hesitate to share any thoughts on this topic.

Ekaterina Sokolova
Postgres Professional:
The Russian Postgres Company

Attachment Content-Type Size
extra_statistics_v3.patch text/x-diff 17.3 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2021-03-25 10:01:23 Re: wal stats questions
Previous Message Joel Jacobson 2021-03-25 09:48:55 Re: [PATCH] pg_permissions