Re: Avoid stack frame setup in performance critical routines using tail calls

From: Andres Freund <andres(at)anarazel(dot)de>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Tomas Vondra <tv(at)fuzzy(dot)cz>
Subject: Re: Avoid stack frame setup in performance critical routines using tail calls
Date: 2024-02-22 22:53:43
Message-ID: 20240222225343.7a57fiv7quehhlur@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2024-02-23 00:46:26 +1300, David Rowley wrote:
> I've rebased the 0001 patch and gone over it again and made a few
> additional changes besides what I mentioned in my review.
>
> On Wed, 9 Aug 2023 at 20:44, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> > Here's a review of v2-0001:
> > 2. Why do you need to add the NULL check here?
> >
> > #ifdef USE_VALGRIND
> > - if (method != MCTX_ALIGNED_REDIRECT_ID)
> > + if (ret != NULL && method != MCTX_ALIGNED_REDIRECT_ID)
> > VALGRIND_MEMPOOL_CHANGE(context, pointer, ret, size);
> > #endif
>
> I removed this NULL check as we're calling the realloc function with
> no flags, so it shouldn't return NULL as it'll error out from any OOM
> errors.

That was probably a copy-paste issue...

> > 4. It would be good to see some API documentation in the
> > MemoryContextMethods struct. This adds a lot of responsibility onto
> > the context implementation without any extra documentation to explain
> > what, for example, palloc is responsible for and what the alloc
> > function needs to do itself.
>
> I've done that too.
>
> I also added header comments for MemoryContextAllocationFailure and
> MemoryContextSizeFailure and added some comments to explain in places
> like palloc() to warn people not to add checks after the 'alloc' call.
>
> The rebased patch is 0001 and all of my changes are in 0002. I will
> rebase your original 0002 patch later.

Thanks!

> I think 0001 is much more important, as evident by the reported benchmarks
> on this thread.

I agree that it's good to tackle 0001 first.

I don't understand the benchmark point though. Your benchmark seems to suggest
that 0002 improves aset performance by *more* than 0001: for 8 byte aset
allocs:

time
master: 8.86
0001: 8.12
0002: 7.02

So 0001 reduces time by 0.92x and 0002 by 0.86x.

John's test shows basically no change for 0002 - which is unsurprising, as
0002 changes aset.c, but the test seems to solely exercise slab, as only
SlabAlloc() shows up in the profile. As 0002 only touches aset.c it couldn't
really have affected that test.

> In absence of anyone else looking at this, I think it's ready to go.
> If anyone is following along and wants to review or test it, please do
> so soon.

Makes sense!

> @@ -1061,6 +1072,16 @@ MemoryContextAlloc(MemoryContext context, Size size)
>
> context->isReset = false;
>

For a moment this made me wonder if we could move the isReset handling into
the allocator slow paths as well - it's annoying to write that bit (and thus
dirty the cacheline) over and ove. But it'd be somewhat awkward due to
pre-allocated blocks. So that'd be a larger change better done separately.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2024-02-22 23:28:10 Re: Documentation to upgrade logical replication cluster
Previous Message Jeff Davis 2024-02-22 22:46:51 Re: Improve readability by using designated initializers when possible