Re: [PoC] Improve dead tuple storage for lazy vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date: 2022-11-18 13:20:10
Message-ID: CAD21AoDKyKKUoxca1UT4mz59e1iTWmhcdQFKFC2CkRrE=nXkVA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 17, 2022 at 12:24 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Wed, Nov 16, 2022 at 4:39 PM John Naylor
> <john(dot)naylor(at)enterprisedb(dot)com> wrote:
> >
> >
> > On Wed, Nov 16, 2022 at 12:33 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Nov 16, 2022 at 1:46 PM John Naylor
> > > <john(dot)naylor(at)enterprisedb(dot)com> wrote:
> > > >
> > > >
> > > > On Tue, Nov 15, 2022 at 11:59 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > > Thanks! Please let me know if there is something I can help with.
> > > >
> > > > I didn't get very far because the tests fail on 0004 in rt_verify_node:
> > > >
> > > > TRAP: failed Assert("n4->chunks[i - 1] < n4->chunks[i]"), File: "../src/backend/lib/radixtree.c", Line: 2186, PID: 18242
> > >
> > > Which tests do you use to get this assertion failure? I've confirmed
> > > there is a bug in 0005 patch but without it, "make check-world"
> > > passed.
> >
> > Hmm, I started over and rebuilt and it didn't reproduce. Not sure what happened, sorry for the noise.
>
> Good to know. No problem.
>
> > I'm attaching a test I wrote to stress test branch prediction in search, and while trying it out I found two possible issues.
>
> Thank you for testing!
>
> >
> > It's based on the random int load test, but tests search speed. Run like this:
> >
> > select * from bench_search_random_nodes(10 * 1000 * 1000)
> >
> > It also takes some care to include all the different node kinds, restricting the possible keys by AND-ing with a filter. Here's a simple demo:
> >
> > filter = ((uint64)1<<40)-1;
> > LOG: num_keys = 9999967, height = 4, n4 = 17513814, n32 = 6320, n128 = 62663, n256 = 3130
> >
> > Just using random integers leads to >99% using the smallest node. I wanted to get close to having the same number of each, but that's difficult while still using random inputs. I ended up using
> >
> > filter = (((uint64) 0x7F<<32) | (0x07<<24) | (0xFF<<16) | 0xFF)
> >
> > which gives
> >
> > LOG: num_keys = 9291812, height = 4, n4 = 262144, n32 = 79603, n128 = 182670, n256 = 1024
> >
> > Which seems okay for the task. One puzzling thing I found while trying various filters is that sometimes the reported tree height would change. For example:
> >
> > filter = (((uint64) 1<<32) | (0xFF<<24));
> > LOG: num_keys = 9999944, height = 7, n4 = 47515559, n32 = 6209, n128 = 62632, n256 = 3161
> >
> > 1) Any idea why the tree height would be reported as 7 here? I didn't expect that.
>
> In my environment, (0xFF<<24) is 0xFFFFFFFFFF000000, not 0xFF000000.
> It seems the filter should be (((uint64) 1<<32) | ((uint64)
> 0xFF<<24)).
>
> >
> > 2) It seems that 0004 actually causes a significant slowdown in this test (as in the attached, using the second filter above and with turboboost disabled):
> >
> > v9 0003: 2062 2051 2050
> > v9 0004: 2346 2316 2321
> >
> > That means my idea for the pointer struct might have some problems, at least as currently implemented. Maybe in the course of separating out and polishing that piece, an inefficiency will fall out. Or, it might be another reason to template local and shared separately. Not sure yet. I also haven't tried to adjust this test for the shared memory case.
>
> I'll also run the test on my environment and do the investigation tomorrow.
>

FYI I've not tested the patch you shared today but here are the
benchmark results I did with the v9 patch in my environment (I used
the second filter). I splitted 0004 patch into two patches: a patch
for pure refactoring patch to introduce rt_node_ptr and a patch to do
pointer tagging.

v9 0003 patch : 1113 1114 1114
introduce rt_node_ptr: 1127 1128 1128
pointer tagging : 1085 1087 1086 (equivalent to 0004 patch)

In my environment, rt_node_ptr seemed to lead some overhead but
pointer tagging had performance benefits. I'm not sure the reason why
the results are different from yours. The radix tree stats shows the
same as your tests.

=# select * from bench_search_random_nodes(10 * 1000 * 1000);
2022-11-18 22:18:21.608 JST [3913544] LOG: num_keys = 9291812, height
= 4, n4 = 262144, n32 =79603, n128 = 182670, n256 = 1024

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stavros Koureas 2022-11-18 13:26:25 Logical Replication Custom Column Expression
Previous Message Satya Thirumani 2022-11-18 13:08:56 Unable to reset stats using pg_stat_reset_replication_slot