Re: Should heapam_estimate_rel_size consider fillfactor?

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Peter Eisentraut <peter(at)eisentraut(dot)org>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Should heapam_estimate_rel_size consider fillfactor?
Date: 2023-07-03 17:54:19
Message-ID: 2146d2f2-b445-3669-e231-4a2407f5bb9c@enterprisedb.com
Lists: pgsql-hackers

On 7/3/23 11:40, Tomas Vondra wrote:
> ...
>
> FWIW, the integer division is most likely intentional because we want
> "floor" semantics - if there are 10.23 rows per page, that really means
> 10 rows per page.
>
> I doubt it makes a huge difference in this particular place, considering
> we're calculating the estimate from somewhat unreliable values and then
> using it for a rough estimate of the relation size.
>
> But from this POV, I think it's more correct to do it "my" way:
>
> density = (usable_bytes_per_page * fillfactor / 100) / tuple_width;
>
> because that's doing *two* separate integer divisions, with floor
> semantics. First we calculate "usable bytes" (rounded down), then the
> average number of rows per page (also rounded down).
>
> Corey's formula would do just one integer division. I don't think it
> makes a huge difference, though. I mean, it's just an estimate, so we
> can't expect it to be 100% accurate.
>

Pushed, using the formula with two divisions (as in the original patch).
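
Just to spell out the floor behaviour for the archives, here's a trivial
standalone C sketch of what the committed two-division formula does. The
input values are made up (chosen to roughly reproduce the 10.23 example
above), not taken from the actual heapam_estimate_rel_size code:

    #include <stdio.h>

    int
    main(void)
    {
        /* made-up inputs, just to show the rounding behaviour */
        int     usable_bytes_per_page = 8168;   /* ~8kB page minus header */
        int     fillfactor = 70;
        int     tuple_width = 559;

        /* the committed formula: two integer divisions, each rounding down */
        int     density = (usable_bytes_per_page * fillfactor / 100) / tuple_width;

        /* the same computation without rounding, for comparison */
        double  exact = (usable_bytes_per_page * fillfactor / 100.0) / tuple_width;

        printf("floor density: %d rows/page\n", density);
        printf("exact density: %.2f rows/page\n", exact);

        return 0;
    }

Both intermediate results get rounded down, so the ~10.23 rows per page
come out as 10, which is the "floor" semantics described above.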

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
