Re: fix cost subqueryscan wrong parallel cost

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Richard Guo <guofenglinux(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fix cost subqueryscan wrong parallel cost
Date: 2022-04-29 20:29:20
Message-ID: CAKFQuwaHhrLgvNVAUQWDMaTKg=V=4vi2P1bSFG6x__eMhZKZcQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Apr 29, 2022 at 12:31 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> "David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes:
> > The fact that (baserel.rows > path->subpath->rows) here seems like a
> > straight bug: there are no filters involved in this case but in the
> > presence of filters baserel->rows should be strictly (<=
> > path->subpath->rows), right?
>
> No, because the subpath's rowcount has been derated to represent the
> number of rows any one worker is expected to process. Since the
> SubqueryScan is below the Gather, it has to represent that number
> of rows too. Like I said, this design is pretty confusing; though
> I do not have a better alternative to suggest.
>
> Anyway, after chewing on this for awhile, it strikes me that the
> sanest way to proceed is for cost_subqueryscan to compute its rows
> estimate as the subpath's rows estimate times the selectivity of
> the subqueryscan's quals (if any).

This is what Robert was getting at, and I followed-up on.

The question I ended up at is why doesn't baserel->rows already produce the
value you now propose to calculate directly within cost_subquery

set_baserel_size_estimates (multiplies rel->tuples - which is derated - by
selectivity, sets rel->rows)
set_subquery_size_estimates
rel->subroot = subquery_planner(...)
// my expectation is that
sub_final_rel->cheapest_total_path->rows is the derated number of rows;
// the fact you can reach the derated amount later by using
path->subpath->rows seems to affirm this.
sets rel->tuples from sub_final_rel->cheapest_total_path->rows)
set_subquery_pathlist (executes the sizing call stack above, then proceeds
to create_subqueryscan_path which in turn calls cost_subquery)
set_rel_size
...

> * By analogy to other sorts of relation-scan nodes. But those don't
> have any subpath that they could consult instead. This analogy is
> really a bit faulty, since SubqueryScan isn't a relation scan node
> in any ordinary meaning of that term.
>

I did observe this copy-and-paste dynamic; I take it this is why we cannot
or would not just change all of the baserel->rows usages to
path->subpath->rows.

>
> So perhaps we should do it more like the attached, which produces
> this plan for the UNION case:
>
>
The fact this changes row counts in a costing function is bothersome - it
would be nice to go the other direction and remove the if block. More
generally, path->path.rows should be read-only by the time we get to
costing. But I'm not out to start a revolution here either. But it feels
like we are just papering over a bug in how baserel->rows is computed; per
my analysis above.

In short, the solution seems like it should, and in fact does here, fix the
observed problem. I'm fine with that.

David J.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2022-04-29 20:31:59 Re: bogus: logical replication rows/cols combinations
Previous Message Andres Freund 2022-04-29 20:11:45 Re: [RFC] building postgres with meson -v8