Re: Boundary value check in lazy_tid_reaped()

From: Hannu Krosing <hannuk(at)google(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Boundary value check in lazy_tid_reaped()
Date: 2021-03-16 21:59:42
Message-ID: CAMT0RQSiRziFRnvWsZOdmcYLA6muBsEUr2O06UwhWUbX6LsahA@mail.gmail.com
Lists: pgsql-hackers

We could also go parallel in another direction: I have been mulling over
writing a "vectorized" bsearch which would use AVX2, where you look up 64
(or more likely 32, so the TIDs also fit in a 256-bit vector) TIDs at a time.

The trickiest part is that the search can complete at a different iteration
for each value.
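
Very roughly, the lockstep bookkeeping could look something like the
sketch below (untested, and over plain int32 keys rather than real
encoded TIDs, just to illustrate the idea). A masked gather keeps lanes
whose range has already collapsed from doing any further work:

#include <immintrin.h>
#include <stdint.h>

/*
 * Hypothetical sketch: look up 8 sorted keys at once in a sorted array
 * of length n.  Every lane keeps its own [lo, hi) range, all lanes run
 * the same fixed number of iterations, and lanes that have already
 * converged are masked out of the gather.  Stores each key's
 * lower-bound index into result[].
 */
static void
bsearch8_avx2(const int32_t *arr, int32_t n,
              const int32_t *keys, int32_t *result)
{
    __m256i lo  = _mm256_setzero_si256();
    __m256i hi  = _mm256_set1_epi32(n);
    __m256i one = _mm256_set1_epi32(1);
    __m256i key = _mm256_loadu_si256((const __m256i *) keys);

    for (int32_t range = n; range > 0; range >>= 1)
    {
        /* mid = lo + (hi - lo) / 2, per lane */
        __m256i mid = _mm256_add_epi32(lo,
                        _mm256_srli_epi32(_mm256_sub_epi32(hi, lo), 1));
        /* lanes still searching (lo < hi) */
        __m256i active = _mm256_cmpgt_epi32(hi, lo);
        /*
         * arr[mid] for active lanes; finished lanes just get "key" back,
         * which turns the comparison below into a no-op for them.
         */
        __m256i val = _mm256_mask_i32gather_epi32(key, arr, mid, active, 4);
        /* where arr[mid] < key: lo = mid + 1; elsewhere: hi = mid */
        __m256i lt = _mm256_cmpgt_epi32(key, val);

        lo = _mm256_blendv_epi8(lo, _mm256_add_epi32(mid, one), lt);
        hi = _mm256_blendv_epi8(mid, hi, lt);
    }
    _mm256_storeu_si256((__m256i *) result, lo);
}

Real TIDs would have to be encoded into 64-bit integers first, which
halves the lane count per 256-bit vector.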

Of course, it is possible that this has very bad RAM access behaviour and
is no win at all, even if it otherwise works.

--
Hannu Krosing

On Tue, Mar 16, 2021 at 10:08 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:

> On Sun, Mar 14, 2021 at 4:22 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
> wrote:
> > BTW I got around to trying this idea out for a specialised
> > bsearch_itemptr() using a wide comparator, over here:
>
> Cool!
>
> I have another thing that should be considered when we revisit this
> area in the future: maybe we should structure the binary search to
> lookup multiple TIDs at once -- probably all of the TIDs from a given
> leaf page, taken together.
>
> There is now an nbtree function used for tuple deletion (all tuple
> deletion, not just bottom-up deletion) that works like this:
> _bt_delitems_delete_check(). I suspect that it would make sense to
> generalize it to do the same thing for regular VACUUM. Perhaps this
> idea would have to be combined with other techniques to show a real
> benefit. It would probably be necessary to sort the TIDs first (just
> like index_delete_sort() does for the _bt_delitems_delete_check() code
> today), but that's probably no big deal.
>
> It is typical to have 200 - 400 TIDs on an nbtree leaf page without
> using deduplication. And with deduplication enabled you can have as
> many as 1300 TIDs on a single 8KiB nbtree leaf page. It's easy to
> imagine something like GCC's __builtin_prefetch() (or maybe just more
> predictable access patterns in our "batch binary search") making
> everything much faster through batching. This will also naturally make
> btvacuumpage() much less "branchy", since of course it will no longer
> need to process the page one TID at a time -- that helps too.
>
> --
> Peter Geoghegan
