Re: fix cost subqueryscan wrong parallel cost

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Richard Guo <guofenglinux(at)gmail(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fix cost subqueryscan wrong parallel cost
Date: 2022-04-29 18:09:51
Message-ID: 343340.1651255791@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> So I think the actual problem here is that we leave the decision
> to elide no-op SubqueryScan nodes till setrefs.c. We should detect
> that earlier, and when it applies, label the SubqueryScanPath with
> the exact cost of its child.

Hmm ... actually, while doing that seems like it'd be a good idea,
it doesn't have much bearing on the case at hand. I approximated
the results of such a change on this test case by just removing
the "cpu_tuple_cost" component from cost_subqueryscan:

@@ -1420,7 +1420,7 @@ cost_subqueryscan(SubqueryScanPath *path, PlannerInfo *root,
     get_restriction_qual_cost(root, baserel, param_info, &qpqual_cost);
 
     startup_cost = qpqual_cost.startup;
-    cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;
+    cpu_per_tuple = qpqual_cost.per_tuple;
     run_cost = cpu_per_tuple * baserel->tuples;
 
     /* tlist eval costs are paid per output row, not per tuple scanned */

and what I got was

 Unique  (cost=28144.85..29344.85 rows=120000 width=12)
   ->  Sort  (cost=28144.85..28444.85 rows=120000 width=12)
         Sort Key: "*SELECT* 1".a, "*SELECT* 1".b, "*SELECT* 1".c
         ->  Append  (cost=0.00..1750.00 rows=120000 width=12)
               ->  Subquery Scan on "*SELECT* 1"  (cost=0.00..575.00 rows=60000 width=12)
                     ->  Gather  (cost=0.00..575.00 rows=60000 width=12)
                           Workers Planned: 2
                           ->  Parallel Seq Scan on t  (cost=0.00..575.00 rows=25000 width=12)
               ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..575.00 rows=60000 width=12)
                     ->  Gather  (cost=0.00..575.00 rows=60000 width=12)
                           Workers Planned: 2
                           ->  Parallel Seq Scan on t t_1  (cost=0.00..575.00 rows=25000 width=12)

The subqueryscans are being costed the way we want now, but it's still
going for the append-atop-gathers plan. So I dug a bit deeper, and
found that generate_union_paths does also create the gather-atop-append
plan, but *it's costed at 1750 units*, exactly the same as the other
path. So add_path is just making an arbitrary choice between two paths
of identical cost, and it happens to pick this one.

(If I don't make the above tweak to cost_subqueryscan, the same thing
happens, although the two competing paths now each have cost 2950.)

So: why do we come out with a cost of 1750 for the very same plan
that in the UNION ALL case is costed at 1400? AFAICS it's because
the UNION ALL case thinks that the two inputs of the parallel append
produce 25000 rows apiece so the parallel append produces 50000 rows,
and it costs the append's overhead on that basis. But in the UNION
case, the parallel append sees two inputs that are claimed to return
60000 rows, so the append produces 120000 rows, meaning more append
overhead. We can't readily get EXPLAIN to print this tree since
it's rejected by add_path, but what I'm seeing (with the above
costing hack) is

   {GATHERPATH
   :rows 120000
   :startup_cost 0.00
   :total_cost 1750.00
   :subpath
      {APPENDPATH
      :parallel_aware true
      :parallel_safe true
      :parallel_workers 2
      :rows 120000
      :startup_cost 0.00
      :total_cost 1750.00
      :subpaths (
         {SUBQUERYSCANPATH
         :rows 60000
         :startup_cost 0.00
         :total_cost 575.00
         :subpath
            {PATH
            :parallel_aware true
            :parallel_safe true
            :parallel_workers 2
            :rows 25000
            :startup_cost 0.00
            :total_cost 575.00
            }
         }
         {SUBQUERYSCANPATH
         :rows 60000
         :startup_cost 0.00
         :total_cost 575.00
         :subpath
            {PATH
            :parallel_aware true
            :parallel_safe true
            :parallel_workers 2
            :rows 25000
            :startup_cost 0.00
            :total_cost 575.00
            }
         }
      )
      }
   }

In short, these SubqueryScans are being labeled as producing 60000 rows
when their input only produces 25000 rows, which is surely insane.

So: even though the SubqueryScan itself isn't parallel-aware, the number
of rows it processes has to be de-rated according to the number of workers
involved. Perhaps something like the original patch in this thread is
indeed correct. But I can't help feeling that this definition of
path_rows is mighty unintuitive and error-prone, and that if
cost_subqueryscan is wrong then it's likely got lots of company.
Maybe we need to take two steps back and rethink the whole approach.

regards, tom lane
