Re: [PoC] Improve dead tuple storage for lazy vacuum

From: Andres Freund <andres(at)anarazel(dot)de>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date: 2021-07-19 23:49:15
Message-ID: 20210719234915.5kw6k2k5jojvamc5@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2021-07-19 15:20:54 +0900, Masahiko Sawada wrote:
> BTW is the implementation of the radix tree approach available
> somewhere? If so I'd like to experiment with that too.
>
> >
> > I have toyed with implementing adaptively large radix nodes like
> > proposed in https://db.in.tum.de/~leis/papers/ART.pdf - but haven't
> > gotten it quite working.
>
> That seems promising approach.

I've since implemented some, but not all of the ideas of that paper
(adaptive node sizes, but not the tree compression pieces).

E.g. for

select prepare(
1000000, -- max block
20, -- # of dead tuples per page
10, -- dead tuples interval within a page
1 -- page inteval
);
attach size shuffled ordered
array 69 ms 120 MB 84.87 s 8.66 s
intset 173 ms 65 MB 68.82 s 11.75 s
rtbm 201 ms 67 MB 11.54 s 1.35 s
tbm 232 ms 100 MB 8.33 s 1.26 s
vtbm 162 ms 58 MB 10.01 s 1.22 s
radix 88 ms 42 MB 11.49 s 1.67 s

and for
select prepare(
1000000, -- max block
10, -- # of dead tuples per page
1, -- dead tuples interval within a page
1 -- page inteval
);

attach size shuffled ordered
array 24 ms 60MB 3.74s 1.02 s
intset 97 ms 49MB 3.14s 0.75 s
rtbm 138 ms 36MB 0.41s 0.14 s
tbm 198 ms 101MB 0.41s 0.14 s
vtbm 118 ms 27MB 0.39s 0.12 s
radix 33 ms 10MB 0.28s 0.10 s

(this is an almost unfairly good case for radix)

Running out of time to format the results of the other testcases before
I have to run, unfortunately. radix uses 42MB both in test case 3 and
4.

The radix tree code isn't good right now. A ridiculous amount of
duplication etc. The naming clearly shows its origins from a buffer
mapping radix tree...

Currently in a bunch of the cases 20% of the time is spent in
radix_reaped(). If I move that into radix.c and for bfm_lookup() to be
inlined, I get reduced overhead. rbtm for example essentially already
does that, because it does splitting of ItemPointer in rtbm.c.

I've attached my current patches against your tree.

Greetings,

Andres Freund

Attachment Content-Type Size
0001-Fix-build-warnings.patch text/x-diff 2.3 KB
0002-Add-radix-tree.patch text/x-diff 76.8 KB
0003-Add-radix-tree-benchmark-integration.patch text/x-diff 7.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2021-07-19 23:57:45 Re: Transactions and indexes
Previous Message Tom Lane 2021-07-19 23:43:07 Re: O_DIRECT on macOS