Re: [PoC] Improve dead tuple storage for lazy vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date: 2021-07-10 12:11:20
Message-ID: CAD21AoAXkuV7+57AOZXDkVUDHu6zat_cqAKcXn2Te0j=18yhFw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jul 10, 2021 at 2:17 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2021-07-07 20:46:38 +0900, Masahiko Sawada wrote:
> > Currently, the TIDs of dead tuples are stored in an array that is
> > collectively allocated at the start of lazy vacuum and TID lookup uses
> > bsearch(). There are the following challenges and limitations:
>
> > So I prototyped a new data structure dedicated to storing dead tuples
> > during lazy vacuum while borrowing the idea from Roaring Bitmap[2].
> > The authors provide an implementation of Roaring Bitmap[3] (Apache
> > 2.0 license). But I've implemented this idea from scratch because we
> > need to integrate it with Dynamic Shared Memory/Area to support
> > parallel vacuum and need to support ItemPointerData, 6-bytes integer
> > in total, whereas the implementation supports only 4-bytes integers.
> > Also, when it comes to vacuum, we neither need to compute the
> > intersection, the union, nor the difference between sets, but need
> > only an existence check.
> >
> > The data structure is somewhat similar to TIDBitmap. It consists of
> > the hash table and the container area; the hash table has entries per
> > block and each block entry allocates its memory space, called a
> > container, in the container area to store its offset numbers. The
> > container area is actually an array of bytes and can be enlarged as
> > needed. In the container area, the data representation of offset
> > numbers varies depending on their cardinality. It has three container
> > types: array, bitmap, and run.
>
> How are you thinking of implementing iteration efficiently for rtbm? The
> second heap pass needs that obviously... I think the only option would
> be to qsort the whole thing?

Yes, I'm thinking that the iteration of rtbm is somewhat similar to
tbm. That is, we iterate and collect hash table entries and do qsort
hash entries by the block number. Then fetch the entry along with its
container one by one in order of the block number.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2021-07-10 12:54:35 Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors
Previous Message Dean Rasheed 2021-07-10 11:54:15 pgsql: Fix numeric_mul() overflow due to too many digits after decimal