Re: Re: fix cost subqueryscan wrong parallel cost

From: "bucoo(at)sohu(dot)com" <bucoo(at)sohu(dot)com>
To: robertmhaas <robertmhaas(at)gmail(dot)com>
Cc: "Richard Guo" <guofenglinux(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: fix cost subqueryscan wrong parallel cost
Date: 2022-04-21 06:38:22
Message-ID: 202204211438224657779@sohu.com
Lists: pgsql-hackers

> > For now, the function cost_subqueryscan always uses *total* rows, even for a
> > parallel path, like this:
> >
> > Gather (rows=30000)
> >   Workers Planned: 2
> >   -> Subquery Scan (rows=30000) -- *total* rows, should be equal subpath
> >        -> Parallel Seq Scan (rows=10000)
>
> OK, that's bad.
>
> > Maybe the code:
> >
> >     /* Mark the path with the correct row estimate */
> >     if (param_info)
> >         path->path.rows = param_info->ppi_rows;
> >     else
> >         path->path.rows = baserel->rows;
> >
> > should change to:
> >
> >     /* Mark the path with the correct row estimate */
> >     if (path->path.parallel_workers > 0)
> >         path->path.rows = path->subpath->rows;
> >     else if (param_info)
> >         path->path.rows = param_info->ppi_rows;
> >     else
> >         path->path.rows = baserel->rows;
>
> Suppose parallelism is not in use and that param_info is NULL. Then,
> is path->subpath->rows guaranteed to be equal to baserel->rows? If
> yes, then we don't need a three-part if statement as you propose
> here and can just change the "else" clause to say path->path.rows =
> path->subpath->rows. If no, then your change gives the wrong answer.
I checked some regression tests. Sometimes a subquery scan has a filter,
so path->subpath->rows is *not* guaranteed to be equal to baserel->rows.
If the first patch is wrong, I don't know how to fix this;
it looks like I need someone's help.
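
To make the three alternatives discussed above easier to compare, here is a minimal standalone sketch. It is not PostgreSQL code: the Sketch* structs and the rows_* functions are hypothetical stand-ins for Path, ParamPathInfo and RelOptInfo, and it assumes the subquery scan path inherits parallel_workers from its subpath. The row counts in the usage note are the ones from the plan above plus one invented filter case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the planner structs (assumption). */
typedef struct SketchPath
{
    double rows;              /* row estimate of the subpath */
    int    parallel_workers;  /* > 0 means the subpath is a partial path */
} SketchPath;

typedef struct SketchParamInfo
{
    double ppi_rows;          /* row estimate for a parameterized path */
} SketchParamInfo;

typedef struct SketchRel
{
    double rows;              /* total rows of the subquery rel, after its filter */
} SketchRel;

/* Current logic as quoted in the thread: the subpath is never consulted. */
static double
rows_current(const SketchPath *subpath, const SketchParamInfo *param_info,
             const SketchRel *baserel)
{
    (void) subpath;
    if (param_info)
        return param_info->ppi_rows;
    return baserel->rows;
}

/* Proposed three-part logic: trust the subpath only when it is parallel. */
static double
rows_proposed(const SketchPath *subpath, const SketchParamInfo *param_info,
              const SketchRel *baserel)
{
    assert(subpath != NULL);
    if (subpath->parallel_workers > 0)
        return subpath->rows;
    if (param_info)
        return param_info->ppi_rows;
    return baserel->rows;
}

/* Simplification raised in the reply: the "else" branch uses the subpath. */
static double
rows_else_subpath(const SketchPath *subpath, const SketchParamInfo *param_info,
                  const SketchRel *baserel)
{
    (void) baserel;
    assert(subpath != NULL);
    if (param_info)
        return param_info->ppi_rows;
    return subpath->rows;
}
```

With the numbers from the plan above (subpath rows 10000, 2 workers, baserel rows 30000, no param_info), rows_current gives 30000 and rows_proposed gives 10000. For an invented non-parallel case where the subpath emits 30000 rows and a filter leaves baserel->rows at 10000, rows_else_subpath gives 30000 and so ignores the filter, while rows_proposed still gives 10000. In a parallel path that also has a filter, rows_proposed would return the unfiltered subpath estimate, which seems to be the problem the question above points at.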
