Re: decoupling table and index vacuum

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: decoupling table and index vacuum
Date: 2021-04-22 18:56:10
Message-ID: 20210422185610.35gjmmxtan2ooyrg@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2021-04-22 14:47:14 -0400, Robert Haas wrote:
> On Thu, Apr 22, 2021 at 10:28 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > Right. Given decoupling index vacuuming, I think the index’s garbage
> > statistics are important which preferably need to be fetchable without
> > accessing indexes. It would be not hard to estimate how many index
> > tuples might be able to be deleted by looking at the dead TID fork but
> > it doesn’t necessarily match the actual number.
>
> Right, and to appeal (I think) to Peter's quantitative vs. qualitative
> principle, it could be way off. Like, we could have a billion dead
> TIDs and in one index the number of index entries that need to be
> cleaned out could be 1 billion and in another index it could be zero
> (0). We know how much data we will need to scan because we can fstat()
> the index, but there seems to be no easy way to estimate how many of
> those pages we'll need to dirty, because we don't know how successful
> previous opportunistic cleanup has been.

That aspect seems reasonably easy to fix: We can start to report the
number of opportunistically deleted index entries to pgstat. At least
nbtree already performs the actual deletion in bulk and we already have
(currently unused) space in the pgstat entries for it, so I don't think
it'd meaningfully increase overhead. And it'd significantly improve
insight into how indexes operate, even leaving autovacuum etc. aside.
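
To make the idea concrete, here is a rough, self-contained sketch of what
such a counter might look like. Everything in it is hypothetical: the
struct IndexCleanupStats and the functions report_opportunistic_deletion()
and estimate_remaining_dead_entries() are illustrative names, not
PostgreSQL's pgstat API or layout. The estimate assumes that every
opportunistically deleted entry corresponds to one of the dead TIDs, which
need not hold in practice, so treat it as a crude heuristic at best.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-index counters; NOT the real pgstat entry layout. */
typedef struct IndexCleanupStats
{
    uint64_t opportunistic_deletions;   /* entries removed outside VACUUM */
    uint64_t opportunistic_batches;     /* number of bulk deletion calls */
} IndexCleanupStats;

/*
 * Record one bulk deletion. Because the deletion itself is done in bulk,
 * the counters are touched once per batch rather than once per entry.
 */
static void
report_opportunistic_deletion(IndexCleanupStats *stats, uint32_t ndeleted)
{
    assert(stats != NULL);

    if (ndeleted == 0)
        return;                 /* nothing was deleted, nothing to report */

    stats->opportunistic_deletions += ndeleted;
    stats->opportunistic_batches++;
}

/*
 * Crude estimate of how many index entries are still left to clean up,
 * given the number of dead TIDs. Assumption (not from the thread): each
 * opportunistic deletion removed an entry for one of those dead TIDs.
 * The result is clamped at zero.
 */
static uint64_t
estimate_remaining_dead_entries(const IndexCleanupStats *stats,
                                uint64_t dead_tids)
{
    assert(stats != NULL);

    if (stats->opportunistic_deletions >= dead_tids)
        return 0;

    return dead_tids - stats->opportunistic_deletions;
}
```

With counters like these, an index where opportunistic cleanup has already
removed most entries would show an estimate near zero, while one where it
has removed none would show roughly the full dead TID count.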

Greetings,

Andres Freund
