Re: [PoC] Improve dead tuple storage for lazy vacuum

From: John Naylor <johncnaylorls(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date: 2024-03-05 09:41:30
Message-ID: CANWCAZZEx5rYeYKJjvogZxyiKEQ3XbuRRubQycKycj_hBf_5cw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 5, 2024 at 8:27 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Mon, Mar 4, 2024 at 8:48 PM John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
> >
> > On Mon, Mar 4, 2024 at 1:05 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:

> > > Resetting the tree->context seems to work. But I think we should note
> > > for callers that the dsa_area passed to RT_CREATE should be created in
> > > a different context than the context passed to RT_CREATE because
> > > otherwise RT_FREE() will also free the dsa_area. For example, the
> > > following code in test_radixtree.c will no longer work:

I've added a comment in v66-0004, which contains a number of other
small corrections and edits.

On Fri, Mar 1, 2024 at 3:01 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > Thirdly, cosmetic: With the introduction of single-value leaves, it
> > seems we should do s/RT_NODE_PTR/RT_CHILD_PTR/ -- what do you think?
>
> Agreed.

Done in v66-0005.

v66-0006 removes outdated tests for invalid root that somehow got left over.

> > I wrote:
> > > > Secondly, I thought about my recent work to skip checking if we first
> > > > need to create a root node, and that has a harmless (for vacuum at
> > > > least) but slightly untidy behavior: When RT_SET is first called, and
> > > > the key is bigger than 255, new nodes will go on top of the root node.
> > > > These have chunk '0'. If all subsequent keys are big enough, the
> > > > orginal root node will stay empty. If all keys are deleted, there will
> > > > be a chain of empty nodes remaining. Again, I believe this is
> > > > harmless, but to make tidy, it should easy to teach RT_EXTEND_UP to
> > > > call out to RT_EXTEND_DOWN if it finds the tree is empty. I can work
> > > > on this, but likely not today.

> > I put a little more work into this, and got it working, just needs a
> > small amount of finicky coding. I'll share tomorrow.

Done in v66-0007. I'm a bit disappointed in the extra messiness this
adds, although it's not a lot.

> > + check_stack_depth();
> > + CHECK_FOR_INTERRUPTS();
> >
> > I'm not sure why these are here: The first seems overly paranoid,
> > although harmless, but the second is probably a bad idea. Why should
> > the user be able to to interrupt the freeing of memory?
>
> Good catch. We should not check the interruption there.

Removed in v66-0008.

> > Also, I'm not quite happy that RT_ITER has a copy of a pointer to the
> > tree, leading to coding like "iter->tree->ctl->root". I *think* it
> > would be easier to read if the tree was a parameter to these iteration
> > functions. That would require an API change, so the tests/tidstore
> > would have some churn. I can do that, but before trying I wanted to
> > see what you think -- is there some reason to keep the current way?
>
> I considered both usages, there are two reasons for the current style.
> I'm concerned that if we pass both the tree and RT_ITER to iteration
> functions, the caller could mistakenly pass a different tree than the
> one that was specified to create the RT_ITER. And the second reason is
> just to make it consistent with other data structures such as
> dynahash.c and dshash.c, but I now realized that in simplehash.h we
> pass both the hash table and the iterator.

Okay, then I don't think it's worth messing with at this point.

On Tue, Feb 6, 2024 at 9:58 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Fri, Feb 2, 2024 at 8:47 PM John Naylor <johncnaylorls(at)gmail(dot)com> wrote:

> > It's pretty hard to see what test_pattern() is doing, or why it's
> > useful. I wonder if instead the test could use something like the
> > benchmark where random integers are masked off. That seems simpler. I
> > can work on that, but I'd like to hear your side about test_pattern().
>
> Yeah, test_pattern() is originally created for the integerset so it
> doesn't necessarily fit the radixtree. I agree to use some tests from
> benchmarks.

Done in v66-0009. I'd be curious to hear any feedback. I like the
aspect that the random numbers come from a different seed every time
the test runs.

v66-0010/0011 run pgindent, the latter with one typedef added for the
test module. 0012 - 0017 are copied from v65, and I haven't done any
work on tidstore or vacuum, except for squashing most v65 follow-up
patches.

I'd like to push 0001 and 0002 shortly, and then do another sweep over
0003, with remaining feedback, and get that in so we get some
buildfarm testing before the remaining polishing work on
tidstore/vacuum.

Attachment Content-Type Size
v66-ART.tar.gz application/gzip 75.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Richard Guo 2024-03-05 09:44:37 Re: Refactoring backend fork+exec code
Previous Message Andrei Lepikhov 2024-03-05 09:36:29 Re: "type with xxxx does not exist" when doing ExecMemoize()