Re: fix cost subqueryscan wrong parallel cost

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Richard Guo <guofenglinux(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fix cost subqueryscan wrong parallel cost
Date: 2022-04-29 14:39:39
Message-ID: CAKFQuwZtpka_aVE8d4-DT=twRSg1cGtdun5wddZ4Cpv9O+zxXA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Apr 29, 2022 at 7:02 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Richard Guo <guofenglinux(at)gmail(dot)com> writes:
> > Currently subquery scan is using rel->rows (if no parameterization),
> > which I believe is not correct. That's not the size the subquery scan
> > node in each worker needs to handle, as the rows have been divided
> > across workers by the parallel-aware node.
>
> Really? Maybe I misunderstand the case under consideration, but
> what I think will be happening is that each worker will re-execute
> the pushed-down subquery in full. Otherwise it can't compute the
> correct answer. What gets divided across the set of workers is
> the total *number of executions* of the subquery, which should be
> independent of the number of workers, so that the cost is (more
> or less) the same as the non-parallel case.
>
>
I've been operating under the belief that a subquery node (or, rather, all
nodes) in a path get their row count estimates from self-selectivity * rows
provided by the next node down in the path. In a partial path, eventually
the parallel aware node at the bottom of the path divides its row count
estimate by the parallel divisor (2.4 for two workers). That value then
returns back up through the path until it hits a gather. Every node in
between, which are almost exclusively parallel-safe (not parallel aware),
can just use the row count being percolated up the path (i.e., they get the
divided row count in their partial pathing and full row counts in the
normal pathing). The gather node, realizing that the row count it is
dealing with has been divided by the parallel divisor, undoes the division
in order to provide the correct row count to the non-parallel executing
nodes above it.

So a subquery is only ever executed once in a path - but the number of
returned rows depends on the number of planned workers done under the
assumption that a query executed among 2 workers costs 1/2.4 the amount of
the same query done with just the leader. It is an independent factor
compared to everything else going on and so the multiplication can simply
be done first to the original row count.

<starts looking for confirmation in the code>

allpaths(dot)c(at)2974-2975 (generate_gather_paths)
(L: 2993 as well)
(L: 3167 - generate_useful_gather_paths)

rows =
cheapest_partial_path->rows * cheapest_partial_path->parallel_workers;

Shouldn't that be:

rows =
cheapest_partial_path->rows * (get_parallel_divisor(cheapest_partial_path))

?

David J.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-04-29 14:54:16 Re: fix cost subqueryscan wrong parallel cost
Previous Message Tom Lane 2022-04-29 14:02:18 Re: fix cost subqueryscan wrong parallel cost