Re: Boundary value check in lazy_tid_reaped()

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Boundary value check in lazy_tid_reaped()
Date: 2021-03-16 21:08:27
Message-ID: CAH2-WzkirUe0Dui2Kd9rSf8GiNQybFGXDT7P0mBvtwDwjSaH4g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Mar 14, 2021 at 4:22 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> BTW I got around to trying this idea out for a specialised
> bsearch_itemptr() using a wide comparator, over here:

Cool!

I have another thing that should be considered when we revisit this
area in the future: maybe we should structure the binary search to
lookup multiple TIDs at once -- probably all of the TIDs from a given
leaf page, taken together.

There is now an nbtree function used for tuple deletion (all tuple
deletion, not just bottom-up deletion) that works like this:
_bt_delitems_delete_check(). I suspect that it would make sense to
generalize it to do the same thing for regular VACUUM. Perhaps this
idea would have to be combined with other techniques to show a real
benefit. It would probably be necessary to sort the TIDs first (just
like index_delete_sort() does for the _bt_delitems_delete_check() code
today), but that's probably no big deal.

It is typical to have 200 - 400 TIDs on an nbtree leaf page without
using deduplication. And with deduplication enabled you can have as
many as 1300 TIDs on a single 8KiB nbtree leaf page. It's easy to
imagine something like GCC's __builtin_prefetch() (or maybe just more
predictable access patterns in our "batch binary search") making
everything much faster through batching. This will also naturally make
btvacuumpage() much less "branchy", since of course it will no longer
need to process the page one TID at a time -- that helps too.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hannu Krosing 2021-03-16 21:59:42 Re: Boundary value check in lazy_tid_reaped()
Previous Message Andres Freund 2021-03-16 20:00:40 Re: [HACKERS] PATCH: Batch/pipelining support for libpq