Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tv(at)fuzzy(dot)cz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
Date: 2023-01-18 22:37:20
Message-ID: CAH2-Wz=Z5W6YpZU5=m-Qh-DaYV9k9qK=dbxMSRBhbc75T1ZZQA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 18, 2023 at 2:22 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> The problem with the change is here:
>
> /*
> * Okay, we've covered the corner cases. The normal calculation is to
> * convert the old measurement to a density (tuples per page), then
> * estimate the number of tuples in the unscanned pages using that figure,
> * and finally add on the number of tuples in the scanned pages.
> */
> old_density = old_rel_tuples / old_rel_pages;
> unscanned_pages = (double) total_pages - (double) scanned_pages;
> total_tuples = old_density * unscanned_pages + scanned_tuples;
> return floor(total_tuples + 0.5);

My assumption has always been that vac_estimate_reltuples() is prone
to issues like this because it just doesn't have access to very much
information each time it runs. It can only see the delta between what
VACUUM just saw and what the last VACUUM (or possibly the last
ANALYZE) recorded in pg_class. You're always going to find
weaknesses in such a model if you go looking for them. You're always
going to find a way to salami-slice your way from good information to
total nonsense, if you pick the right/wrong test case, which runs
VACUUM in a way that allows whatever bias there may be to accumulate.
It's sort of like the way floating point values can become very
inaccurate through a process that allows many small inaccuracies to
accumulate over time.

Maybe you're right to be concerned to the degree that you're concerned
-- I'm not sure. I'm just adding what I see as important context.

--
Peter Geoghegan
