Re: fix cost subqueryscan wrong parallel cost

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Richard Guo <guofenglinux(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fix cost subqueryscan wrong parallel cost
Date: 2022-05-03 18:13:54
Message-ID: 2174194.1651601634@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> That I don't like at all. I'm still of the opinion that it's a huge
> mistake for EXPLAIN to print int(rowcount/loops) instead of just
> rowcount. The division is never what I want and in my experience is
> also not what other people want and often causes confusion. Both the
> division and the rounding lose information about precisely what row
> count was estimated, which makes it harder to figure out where in the
> plan things went wrong.

I'm inclined to look at it a bit differently: it was a mistake to
use the same "loops" notion for parallelism as for repeated node
execution. But I think we are saying the same thing in one respect,
namely it'd be better if what EXPLAIN shows for parallelism were totals
across all workers rather than per-worker numbers. (I'm unconvinced
about whether repeated node execution ought to work like that.)
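
As a rough illustration of the information loss in the int(rowcount/loops)
display quoted above, here is a minimal Python sketch. The function name and
the sample numbers are hypothetical, and the truncating division simply
follows the wording used in this thread; it is not taken from the EXPLAIN
source code.

```python
def displayed_rows(total_rows: int, loops: int) -> int:
    """Per-loop figure as worded in the thread: int(rowcount / loops)."""
    return int(total_rows / loops)


# Three different (hypothetical) totals, each spread over 3 loops.
totals = [9, 10, 11]
shown = [displayed_rows(t, 3) for t in totals]

# Multiplying the displayed figure back by the loop count does not
# recover the distinct totals we started with.
recovered = [s * 3 for s in shown]

for t, s, r in zip(totals, shown, recovered):
    print(f"total={t} displayed={s} recovered={r}")
```

All three totals collapse to the same displayed value, so a reader cannot
tell them apart from the output alone.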

> I am not at all keen on adding more ways for
> what we print out to be different from the information actually stored
> in the plan tree.

I think the cost estimation functions want to work with per-worker
rowcounts. We could scale that up to totals when we create the
finished plan tree, perhaps.
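
A minimal sketch of what "scaling up to totals" could look like, assuming the
per-worker estimate was derived by dividing by a parallel divisor in which the
leader contributes less as the worker count grows (modeled here on the
planner's get_parallel_divisor() logic with leader participation enabled).
The Python function names are hypothetical, and this is not the proposed
patch.

```python
def parallel_divisor(workers: int, leader_participates: bool = True) -> float:
    """Approximate share divisor: workers plus a shrinking leader share."""
    divisor = float(workers)
    if leader_participates:
        # Assumption: the leader's contribution drops by 0.3 per worker
        # and is ignored once it is no longer positive.
        leader_contribution = 1.0 - 0.3 * workers
        if leader_contribution > 0:
            divisor += leader_contribution
    return divisor


def per_worker_rows(total_rows: float, workers: int) -> float:
    """Per-worker estimate, rounded to a whole row count."""
    return float(round(total_rows / parallel_divisor(workers)))


def scaled_total(rows_per_worker: float, workers: int) -> float:
    """Scale a per-worker estimate back up to a total across workers."""
    return rows_per_worker * parallel_divisor(workers)


per_worker = per_worker_rows(1000, 2)
total_again = scaled_total(per_worker, 2)
print(per_worker, total_again)
```

Because the per-worker figure is rounded before it is scaled back up, the
reconstructed total need not match the original estimate exactly.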

> I don't know for sure what we ought to be storing in
> the plan tree, but I think whatever we store should also be what we
> print. I think the fact that we've chosen to store something in the
> plan tree is strong evidence that that exact value, and not some
> quantity derived therefrom, is what's interesting.

The only reason we store any of this in the plan tree is for
EXPLAIN to print it out. On the other hand, I don't want the
planner expending any large number of cycles modifying the numbers
it works with before putting them in the plan tree, because most
of the time we're not doing EXPLAIN so it'd be wasted effort.

In any case, fundamental redesign of what EXPLAIN prints is a job
for v16 or later. Are you okay with the proposed patch as a v15 fix?

regards, tom lane
