Re: [PoC] Improve dead tuple storage for lazy vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date: 2022-11-21 08:06:56
Message-ID: CAD21AoDyjZJ66hk9Hj7a7DypiMFFTbgR-Nke5OHh4Rt0oWs7Kg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Nov 21, 2022 at 3:43 PM John Naylor
<john(dot)naylor(at)enterprisedb(dot)com> wrote:
>
> On Fri, Nov 18, 2022 at 8:20 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Thu, Nov 17, 2022 at 12:24 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Nov 16, 2022 at 4:39 PM John Naylor
> > > <john(dot)naylor(at)enterprisedb(dot)com> wrote:
>
> > > > That means my idea for the pointer struct might have some problems, at least as currently implemented. Maybe in the course of separating out and polishing that piece, an inefficiency will fall out. Or, it might be another reason to template local and shared separately. Not sure yet. I also haven't tried to adjust this test for the shared memory case.
>
> Digging a bit deeper, I see a flaw in my benchmark: Even though the total distribution of node kinds is decently even, the pattern that the benchmark sees is not terribly random:
>
> 3,343,352 branch-misses:u # 0.85% of all branches
> 393,204,959 branches:u
>
> Recall a previous benchmark [1] where the leaf node was about half node16 and half node32. Randomizing the leaf node between the two caused branch misses to go from 1% to 2%, causing a noticeable slowdown. Maybe in this new benchmark, each level has a skewed distribution of nodes, giving a smart branch predictor something to work with. We will need a way to efficiently generate keys that lead to a relatively unpredictable distribution of node kinds, as seen by a searcher. Especially in the leaves (or just above the leaves), since those are less likely to be cached.
>
> > > I'll also run the test on my environment and do the investigation tomorrow.
> > >
> >
> > FYI, I've not tested the patch you shared today, but here are the
> > benchmark results from the v9 patch in my environment (I used
> > the second filter). I split the 0004 patch into two patches: a pure
> > refactoring patch that introduces rt_node_ptr and a patch that does
> > pointer tagging.
>
> Would you be able to share the refactoring patch? And a fix for the failing tests? I'm thinking I want to try the templating approach fairly soon.
>

Sure. I've attached the v10 patches. 0004 is the pure refactoring
patch and 0005 introduces the pointer tagging.
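
For anyone who has not looked at pointer tagging before, the general idea can be sketched as below. This is not the code from the attached patches: the type and function names (node_kind, tagged_node_ptr, tag_pointer, and so on) and the set of node kinds are hypothetical, and the sketch assumes nodes are at least 4-byte aligned so that the two low bits of the address are free to hold the kind.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical node kinds, for illustration only (not taken from the patch). */
typedef enum
{
	NODE_KIND_4 = 0,
	NODE_KIND_32 = 1,
	NODE_KIND_128 = 2,
	NODE_KIND_256 = 3
} node_kind;

/* Two low bits hold the kind; assumes nodes are at least 4-byte aligned. */
#define KIND_TAG_MASK ((uintptr_t) 0x3)

/* A node address with the node kind stored in its low bits. */
typedef uintptr_t tagged_node_ptr;

/* Combine an aligned node address and its kind into one word. */
static inline tagged_node_ptr
tag_pointer(void *node, node_kind kind)
{
	/* The low bits must be zero, or the tag would corrupt the address. */
	assert(((uintptr_t) node & KIND_TAG_MASK) == 0);
	return (uintptr_t) node | (uintptr_t) kind;
}

/* Read the kind without dereferencing the node. */
static inline node_kind
tagged_kind(tagged_node_ptr p)
{
	return (node_kind) (p & KIND_TAG_MASK);
}

/* Strip the tag to recover the original address. */
static inline void *
tagged_addr(tagged_node_ptr p)
{
	return (void *) (p & ~KIND_TAG_MASK);
}

/* Returns 1 if tagging and untagging round-trip for every kind. */
static int
tagging_roundtrip_ok(void)
{
	static uint64_t fake_node[4];	/* aligned stand-in for a real node */

	for (int k = NODE_KIND_4; k <= NODE_KIND_256; k++)
	{
		tagged_node_ptr p = tag_pointer(fake_node, (node_kind) k);

		if (tagged_kind(p) != (node_kind) k ||
			tagged_addr(p) != (void *) fake_node)
			return 0;
	}
	return 1;
}
```

The point of such a scheme is that a searcher can learn the kind of a child from the pointer itself, before touching the child's memory. Whether that pays off depends on the workload, which is what the benchmarks in this thread are trying to measure.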

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment                                                        Content-Type              Size
v10-0003-tool-for-measuring-radix-tree-performance.patch          application/octet-stream  17.8 KB
v10-0004-Use-rt_node_ptr-to-reference-radix-tree-nodes.patch      application/octet-stream  53.5 KB
v10-0005-PoC-tag-the-node-kind-to-rt_pointer.patch                application/octet-stream  2.6 KB
v10-0007-PoC-lazy-vacuum-integration.patch                        application/octet-stream  35.5 KB
v10-0006-PoC-DSA-support-for-radix-tree.patch                     application/octet-stream  37.7 KB
v10-0002-Add-radix-implementation.patch                           application/octet-stream  82.9 KB
v10-0001-introduce-vector8_min-and-vector8_highbit_mask.patch     application/octet-stream  2.6 KB
