Re: fix cost subqueryscan wrong parallel cost

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "bucoo(at)sohu(dot)com" <bucoo(at)sohu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fix cost subqueryscan wrong parallel cost
Date: 2022-04-15 09:16:44
Message-ID: CAMbWs48qqCgwKrJpyf5rSRx-wNrTk06dcC2jTN=sbg=gRR6a7Q@mail.gmail.com
Lists: pgsql-hackers

On Fri, Apr 15, 2022 at 12:50 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Tue, Apr 12, 2022 at 2:57 AM bucoo(at)sohu(dot)com <bucoo(at)sohu(dot)com> wrote:
> > The cost_subqueryscan function does not judge whether it is parallel.
>
> I don't see any reason why it would need to do that. A subquery scan
> isn't parallel aware.
>
> > regress
> > -- Incremental sort vs. set operations with varno 0
> > set enable_hashagg to off;
> > explain (costs off) select * from t union select * from t order by 1,3;
> > QUERY PLAN
> > ----------------------------------------------------------
> > Incremental Sort
> >   Sort Key: t.a, t.c
> >   Presorted Key: t.a
> >   ->  Unique
> >         ->  Sort
> >               Sort Key: t.a, t.b, t.c
> >               ->  Append
> >                     ->  Gather
> >                           Workers Planned: 2
> >                           ->  Parallel Seq Scan on t
> >                     ->  Gather
> >                           Workers Planned: 2
> >                           ->  Parallel Seq Scan on t t_1
> > to
> > Incremental Sort
> >   Sort Key: t.a, t.c
> >   Presorted Key: t.a
> >   ->  Unique
> >         ->  Sort
> >               Sort Key: t.a, t.b, t.c
> >               ->  Gather
> >                     Workers Planned: 2
> >                     ->  Parallel Append
> >                           ->  Parallel Seq Scan on t
> >                           ->  Parallel Seq Scan on t t_1
> > Obviously the latter is less expensive
>
> Generally it should be. But there's no subquery scan visible here.
>

The paths for the child subtrees of a set operation are SubqueryScan
paths. In this case the SubqueryScan nodes are removed later, in
set_plan_references(), because they are considered trivial.

>
> There may well be something wrong here, but I don't think that you've
> diagnosed the problem correctly, or explained it clearly.
>

Some debugging shows that the second path is generated but then loses
when competing with the first path. So if something is wrong here, I
think the cost calculation is the place to look.

Unrelated to this topic, I noticed another problem in the plan. Note
the first Sort node, which is there to unique-ify the result of the
UNION. Why can't we rearrange its sort keys from (a, b, c) to (a, c, b)
so that we can avoid the second sort (the Incremental Sort on top)?
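
To illustrate the question, here is a minimal Python sketch. It is not
planner code: the sample rows and the helper names sort_then_unique and
is_ordered_by are made up for illustration. It only shows that
de-duplication does not depend on the order of the sort keys, whereas
ORDER BY 1,3 does, so a sort on (a, c, b) could in principle serve both
purposes:

```python
from itertools import groupby

# Hypothetical sample rows (a, b, c); not taken from the regression test table.
ROWS = [(1, 2, 2), (1, 1, 3), (1, 2, 2), (0, 5, 1), (1, 1, 3)]


def sort_then_unique(rows, key_order):
    """Mimic Sort + Unique: sort on every column, then drop adjacent duplicates."""
    ordered = sorted(rows, key=lambda r: tuple(r[i] for i in key_order))
    # After a sort on all columns (in any key order), equal rows are adjacent.
    return [row for row, _ in groupby(ordered)]


def is_ordered_by(rows, key_order):
    """Check whether rows are already in non-decreasing order on key_order."""
    keys = [tuple(r[i] for i in key_order) for r in rows]
    return all(x <= y for x, y in zip(keys, keys[1:]))


by_abc = sort_then_unique(ROWS, (0, 1, 2))  # sort keys as in the plan: (a, b, c)
by_acb = sort_then_unique(ROWS, (0, 2, 1))  # rearranged sort keys: (a, c, b)

# Both key orders yield the same set of distinct rows.
print(sorted(by_abc) == sorted(by_acb))
# Only the (a, c, b) ordering is guaranteed to satisfy ORDER BY a, c.
print(is_ordered_by(by_acb, (0, 2)), is_ordered_by(by_abc, (0, 2)))
```

For these sample rows the output sorted on (a, b, c) is not ordered by
(a, c), which is why the plan needs the extra sort on top.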

Thanks
Richard
