Re: vacuum -vs reltuples on insert only index

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: vacuum -vs reltuples on insert only index
Date: 2020-11-02 18:03:29
Message-ID: CAH2-WzkN7ofNX_99q3w44vDqaz=CRNmSF5V7R7yWrHfmprRX-A@mail.gmail.com
Lists: pgsql-hackers

On Fri, Oct 23, 2020 at 11:10 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> I suspect that we need to move in this direction within nbtree. I'm a
> bit concerned about the partial index problem, though. I suppose maybe
> we could do it the old way (which won't account for posting list
> tuples) during cleanup as you suggest, but only use the final figure
> when it turns out to have been a partial index. For other indexes we
> could do what GIN does here.

Actually, it seems better to always count num_index_tuples the old way
during cleanup-only index VACUUMs, despite the inaccuracy that this
creates with posting list tuples. That inaccuracy is at least fixed
and relatively small, since nbtree doesn't have posting list
compression or a pending list mechanism (unlike GIN). This
approach avoids calculating a num_index_tuples value that is less than
the number of distinct values in the index, which seems important.
Taking a more sophisticated approach seems unnecessary, especially
given that we need something that can be backpatched to Postgres 13.
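
To make the trade-off concrete, here is a toy model of the two ways of
counting. It is only a sketch and not code from the attached patch:
ToyItem, toy_count_items() and toy_count_heap_tids() are hypothetical
names invented for illustration, and a real leaf page obviously holds
IndexTuples rather than bare TID counts.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for one leaf page item; NOT PostgreSQL's IndexTuple. */
typedef struct ToyItem
{
    int nhtids;     /* 1 for a plain tuple, > 1 for a posting list tuple */
} ToyItem;

/*
 * "Old way": each item on the page counts once, so a posting list tuple
 * is counted as a single index tuple no matter how many heap TIDs it holds.
 */
static double
toy_count_items(const ToyItem *items, size_t nitems)
{
    assert(items != NULL || nitems == 0);
    (void) items;               /* only the item count matters here */
    return (double) nitems;
}

/*
 * Exact count: sum the heap TIDs of every item.  This requires looking
 * at each item individually.
 */
static double
toy_count_heap_tids(const ToyItem *items, size_t nitems)
{
    double total = 0;

    assert(items != NULL || nitems == 0);
    for (size_t i = 0; i < nitems; i++)
        total += items[i].nhtids;
    return total;
}
```

For a page whose items hold 1, 3, 1 and 5 heap TIDs, the per-item count
is 4 while the per-TID count is 10. The per-item figure undercounts,
but because every distinct value occupies at least one item, it should
never drop below the number of distinct values.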

Attached is my proposed fix, which takes this approach. I will commit
this on Wednesday or Thursday, barring any objections.

Thanks
--
Peter Geoghegan

Attachment: v1-0001-Avoid-nbtree-cleanup-only-VACUUM-stats-inaccuraci.patch (application/octet-stream, 2.0 KB)
