Re: plan_rows confusion with parallel queries

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plan_rows confusion with parallel queries
Date: 2016-11-03 14:44:13
Message-ID: CA+TgmobunWKyR9TorVeiuAE138Lzj=ss5CEHco+YX37T-7FHgQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Nov 2, 2016 at 4:00 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
>> while eye-balling some explain plans for parallel queries, I got a bit
>> confused by the row count estimates. I wonder whether I'm alone.
>
> I got confused by that a minute ago, so no you're not alone. The problem
> is even worse in join cases. For example:
>
>  Gather  (cost=34332.00..53265.35 rows=100 width=8)
>    Workers Planned: 2
>    ->  Hash Join  (cost=33332.00..52255.35 rows=100 width=8)
>          Hash Cond: ((pp.f1 = cc.f1) AND (pp.f2 = cc.f2))
>          ->  Append  (cost=0.00..8614.96 rows=417996 width=8)
>                ->  Parallel Seq Scan on pp  (cost=0.00..8591.67 rows=416667 width=8)
>                ->  Parallel Seq Scan on pp1  (cost=0.00..23.29 rows=1329 width=8)
>          ->  Hash  (cost=14425.00..14425.00 rows=1000000 width=8)
>                ->  Seq Scan on cc  (cost=0.00..14425.00 rows=1000000 width=8)
>
> There are actually 1000000 rows in pp, and none in pp1. I'm not bothered
> particularly by the nonzero estimate for pp1, because I know where that
> came from, but I'm not very happy that nowhere here does it look like
> it's estimating a million-plus rows going into the join.

I welcome suggestions for improvement, but you will note that if the
row count didn't reflect some kind of guess about the number of rows
that each individual worker will see, the costing would be hopelessly
broken. The cost needs to reflect a guess about the time the query
will finish, not the total amount of effort expended.
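
To make the per-worker arithmetic concrete, here is a minimal sketch in Python. It assumes the planner's divisor heuristic as I understand it from costsize.c of this era: each planned worker counts as 1, and the leader contributes 1.0 - 0.3 * workers when that value is positive. The function names are hypothetical, chosen for illustration, and are not the planner's own. Under that assumption, 1000000 rows with 2 planned workers gives the rows=416667 shown for the Parallel Seq Scan on pp in the plan quoted above.

```python
def parallel_divisor(workers: int) -> float:
    """Estimated number of processes sharing the scan (assumed heuristic)."""
    divisor = float(workers)
    # Assumption: the leader spends part of its time servicing workers, so it
    # only counts as a fraction of a worker, and as nothing once that
    # fraction would go non-positive.
    leader_contribution = 1.0 - 0.3 * workers
    if leader_contribution > 0:
        divisor += leader_contribution
    return divisor


def per_worker_rows(total_rows: float, workers: int) -> int:
    """Rows a single process is expected to see, rounded and clamped to >= 1."""
    if workers <= 0:
        # Non-parallel path: one process sees every row.
        return max(1, round(total_rows))
    return max(1, round(total_rows / parallel_divisor(workers)))


if __name__ == "__main__":
    # 2 planned workers: divisor is 2 + 0.4 = 2.4
    print(per_worker_rows(1_000_000, 2))
```

The point of the sketch is only that the displayed row count is the total divided by the number of processes expected to share the work, which is why no node under the Gather shows the full million rows.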

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
