Re: PROC_IN_ANALYZE stillborn 13 years ago

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, James Coleman <jtc331(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: PROC_IN_ANALYZE stillborn 13 years ago
Date: 2020-08-07 18:41:41
Message-ID: 2793035.1596825701@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Thinking about it more, there are really two ways to think about an
> estimated row count.

> On the one hand, if you think of the row count estimate as the number
> of rows that are going to pop out of a node, then it's always right to
> think of a unique index as limiting the number of occurrences of a
> given value to 1. But, if you think of the row count estimate as a way
> of estimating the amount of work that the node has to do to produce
> that output, then it isn't.

The planner intends its row counts to be interpreted in the first way.
We do have a rather indirect way of accounting for the cost of scanning
dead tuples and such, which is that we scale scanning costs according
to the measured physical size of the relation. That works better for
I/O costs than it does for CPU costs, but it's not completely useless
for the latter. In any case, we'd certainly not want to increase the
scan's row count estimate for that, because that would falsely inflate
our estimate of how much work upper plan levels have to do. Whatever
happens at the scan level, the upper levels aren't going to see those
dead tuples.
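The accounting described above can be sketched roughly as follows. This is a toy model, not PostgreSQL's actual costing code: the constants loosely mirror the default `seq_page_cost` and `cpu_tuple_cost` GUCs, and the function name and dead-tuple parameter are purely illustrative. The point is only that cost scales with the relation's measured physical size, while the row count reflects just the tuples that pop out of the node.

```python
# Toy sketch (NOT PostgreSQL source) of the distinction between a scan's
# cost estimate and its row-count estimate. Constants loosely mirror the
# defaults seq_page_cost = 1.0 and cpu_tuple_cost = 0.01.

SEQ_PAGE_COST = 1.0
CPU_TUPLE_COST = 0.01

def seqscan_estimate(pages, live_tuples, dead_tuples):
    """Return (cost, row_estimate) for a hypothetical sequential scan.

    The scan must read every page and examine dead tuples too, so the
    cost grows with the relation's physical size -- but only live tuples
    are emitted, so the row estimate (what upper plan levels see)
    excludes the dead ones.
    """
    io_cost = SEQ_PAGE_COST * pages
    cpu_cost = CPU_TUPLE_COST * (live_tuples + dead_tuples)
    return io_cost + cpu_cost, live_tuples

# Same live data, once freshly vacuumed and once bloated with dead tuples:
clean = seqscan_estimate(pages=100, live_tuples=10_000, dead_tuples=0)
bloated = seqscan_estimate(pages=200, live_tuples=10_000, dead_tuples=10_000)

print(clean)    # (200.0, 10000)
print(bloated)  # (400.0, 10000) -- cost doubles, row estimate unchanged
```

Inflating the row count instead of the cost would, as noted, falsely inflate the work estimated for every plan level above the scan.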

regards, tom lane
