Re: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .
Date: 2016-06-20 06:54:03
Message-ID: CAD21AoA0KQDv5b8ot73DkvvtXMPyyauxhhsjGmNBLF=uJv=qfg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 20, 2016 at 3:42 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Mon, Jun 20, 2016 at 11:48 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
> wrote:
>>
>> Hi all,
>>
>> My colleague noticed that the output of EXPLAIN ANALYZE doesn't work
>> fine for parallel seq scan.
>>
>> postgres(1)=# explain analyze verbose select count(*) from pgbench_accounts ;
>> QUERY PLAN
>>
>> -----------------------------------------------------------------------------------------------------------------------------------------------------
>> Finalize Aggregate (cost=217018.55..217018.56 rows=1 width=8) (actual time=2640.015..2640.015 rows=1 loops=1)
>>   Output: count(*)
>>   -> Gather (cost=217018.33..217018.54 rows=2 width=8) (actual time=2639.064..2640.002 rows=3 loops=1)
>>        Output: (PARTIAL count(*))
>>        Workers Planned: 2
>>        Workers Launched: 2
>>        -> Partial Aggregate (cost=216018.33..216018.34 rows=1 width=8) (actual time=2632.714..2632.715 rows=1 loops=3)
>>             Output: PARTIAL count(*)
>>             Worker 0: actual time=2632.583..2632.584 rows=1 loops=1
>>             Worker 1: actual time=2627.517..2627.517 rows=1 loops=1
>>             -> Parallel Seq Scan on public.pgbench_accounts (cost=0.00..205601.67 rows=4166667 width=0) (actual time=0.042..1685.542 rows=3333333 loops=3)
>>                  Worker 0: actual time=0.033..1657.486 rows=3457968 loops=1
>>                  Worker 1: actual time=0.039..1702.979 rows=3741069 loops=1
>> Planning time: 1.026 ms
>> Execution time: 2640.225 ms
>> (15 rows)
>>
>> For example, the above result shows,
>> Parallel Seq Scan : actual rows = 3333333
>> worker 0 : actual rows = 3457968
>> worker 1 : actual rows = 3741069
>> Summation of these is 10532370, but actual total rows is 10000000.
>> I think that Parallel Seq Scan should show actual rows =
>> 10000000(total rows) or actual rows = 2800963(rows collected by
>> itself). (10000000 maybe better)
>>
>
> You have to read the rows at Parallel Seq Scan nodes as total count of rows,
> but you have to consider the loops parameter as well.
>
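
If I read that correctly, multiplying rows by loops should give back roughly the total. A minimal sketch of that reading on the numbers from the first plan, assuming rows is the per-loop average over the leader and both workers (that is my assumption, and the variable names are only for illustration):

```python
# Numbers copied from the Parallel Seq Scan node of the first plan.
displayed_rows = 3333333          # "rows=" on the node line
loops = 3                         # leader + 2 workers
worker_rows = [3457968, 3741069]  # "Worker 0" and "Worker 1" lines
table_rows = 10_000_000           # actual row count of pgbench_accounts

# Assumed reading: displayed rows is a per-loop average, so scale it by loops.
approx_total = displayed_rows * loops

# Rows left for the leader once the workers' counts are subtracted.
leader_rows = table_rows - sum(worker_rows)

print(approx_total)  # 9999999
print(leader_rows)   # 2800963
```

Under that assumption the node line is off from the true total by only one row, and the leader's own share is the 2800963 mentioned in my first mail.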

In the following case, it looks to me as if no one collects the tuple.
But that is obviously incorrect; this query actually collects one tuple (aid = 10).

postgres(1)=# explain analyze verbose select * from pgbench_accounts where aid = 10;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=1000.00..217018.43 rows=1 width=97) (actual time=0.541..2094.773 rows=1 loops=1)
  Output: aid, bid, abalance, filler
  Workers Planned: 2
  Workers Launched: 2
  -> Parallel Seq Scan on public.pgbench_accounts (cost=0.00..216018.34 rows=0 width=97) (actual time=1390.109..2088.103 rows=0 loops=3)
       Output: aid, bid, abalance, filler
       Filter: (pgbench_accounts.aid = 10)
       Rows Removed by Filter: 3333333
       Worker 0: actual time=2082.681..2082.681 rows=0 loops=1
       Worker 1: actual time=2087.532..2087.532 rows=0 loops=1
Planning time: 0.126 ms
Execution time: 2095.564 ms
(12 rows)

How should we interpret actual rows together with nloops in this case?
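
To make the question concrete, here is a small sketch of what I suspect is happening. It assumes the displayed rows value is the total over all loops divided by loops, printed with no decimal places; the helper name is hypothetical and this is not taken from the EXPLAIN source.

```python
def displayed_rows(total_rows: int, loops: int) -> str:
    """Hypothetical helper: per-loop average, printed with no decimals."""
    return f"{total_rows / loops:.0f}"

# First plan: 10000000 rows scanned over 3 loops.
print(displayed_rows(10_000_000, 3))  # 3333333

# Second plan: a single matching row over 3 loops rounds down to 0.
print(displayed_rows(1, 3))           # 0
```

If that assumption holds, rows=0 with loops=3 cannot be distinguished from a scan that really returned nothing.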

Regards,

--
Masahiko Sawada
