Re: Use generation context to speed up tuplesorts

From: Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
To: David Rowley <dgrowleyml(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tv(at)fuzzy(dot)cz>
Subject: Re: Use generation context to speed up tuplesorts
Date: 2021-12-17 14:00:25
Message-ID: 4776839.iZASKD2KPV@aivenronan
Lists: pgsql-hackers

On Friday, 17 December 2021, 14:39:10 CET, Tomas Vondra wrote:
> I wasn't really suggesting to investigate those other allocators in this
> patch - it seems like a task requiring a pretty significant amount of
> work/time. My point was that we should make it reasonably easy to add
> tweaks for those other environments, if someone is interested enough to
> do the legwork.
>
> >> 2) In fact, I wonder if different glibc versions behave differently?
> >> Hopefully it's not changing that much, though. Ditto kernel versions,
> >> but the mmap/sbrk interface is likely more stable. We can test this.
> >
> > That could be tested, yes. As a matter of fact, a commit removing the
> > upper limit for MALLOC_MMAP_THRESHOLD was committed to glibc yesterday,
> > which means we can service much bigger allocations without mmap.
>
> Yeah, I noticed that commit too. Most systems stick to one glibc
> version, so it'll take time to reach most systems. Let's continue with
> just one glibc version and then maybe test other versions.

Yes, I also need to figure out how to detect we're using glibc as I'm not very
familiar with configure.
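
As a rough illustration of what a compile-time check could look like (this is not from any proposed patch, and it is not a configure test): glibc headers define the `__GLIBC__` macro, so the tuning can be guarded with the preprocessor. The names `SKETCH_HAVE_GLIBC_MALLOC` and `sketch_tune_malloc` below are hypothetical; `mallopt`, `M_MMAP_THRESHOLD` and `M_TRIM_THRESHOLD` are the real glibc interface.

```c
#include <stdlib.h>     /* any libc header defines __GLIBC__ when built against glibc */

#ifdef __GLIBC__
#include <malloc.h>     /* mallopt(), M_MMAP_THRESHOLD, M_TRIM_THRESHOLD */
#define SKETCH_HAVE_GLIBC_MALLOC 1      /* hypothetical name */
#else
#define SKETCH_HAVE_GLIBC_MALLOC 0
#endif

/*
 * Hypothetical helper: pin both thresholds to threshold_bytes.
 * Returns 1 if glibc accepted both settings, 0 otherwise (including
 * when we are not building against glibc at all).
 */
static int
sketch_tune_malloc(int threshold_bytes)
{
#if SKETCH_HAVE_GLIBC_MALLOC
    /* mallopt() returns 1 on success and 0 on error */
    if (mallopt(M_MMAP_THRESHOLD, threshold_bytes) == 0)
        return 0;
    if (mallopt(M_TRIM_THRESHOLD, threshold_bytes) == 0)
        return 0;
    return 1;
#else
    (void) threshold_bytes;     /* nothing to tune on other allocators */
    return 0;
#endif
}
```

Setting either parameter explicitly also turns off glibc's dynamic adjustment of the mmap threshold. A call such as `sketch_tune_malloc(256 * 1024)` would be made once at startup; the value here is only an example.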

>
> >> 3) If we bump the thresholds, won't that work against reusing the
> >> memory? I mean, if we free a whole block (from any allocator we have),
> >> glibc might return it to kernel, depending on mmap threshold value. It's
> >> not guaranteed, but increasing the malloc thresholds will make that even
> >> less likely. So we might just as well increase the minimum block size,
> >> with about the same effect, no?
> >
> > It is my understanding that malloc will try to compact memory by moving it
> > around. So the memory should actually be released to the kernel at some
> > point. In the meantime, malloc can reuse it for our next invocation (which
> > can be in a different memory context on our side).
> >
> > If we increase the minimum block size, this is memory we will actually
> > reserve, and it will not protect us against the ramping-up behaviour:
> > - the first allocation of a big block may be over mmap_threshold, and
> >   serviced by an expensive mmap
> > - when it's freed, the threshold is doubled
> > - next invocation is serviced by an sbrk call
> > - freeing it will be above the trim threshold, and it will be returned.
> >
> > After several "big" allocations, the thresholds will rise to their
> > maximum values (well, it used to, I need to check what happens with that
> > latest patch of glibc...)
> >
> > This will typically happen several times as malloc doubles the threshold
> > each time. This is probably the reason quadrupling the block sizes was
> > more effective.
>
> Hmmm, OK. Can we benchmark the case with large initial block size, at
> least for comparison?

The benchmark I called "fixed" was with a fixed block size of
ALLOCSET_DEFAULT_MAXSIZE (first proposed patch) and showed roughly the same
performance profile as the growing blocks + malloc tuning.
But if I understand correctly, you implemented the growing blocks logic after
concerns about wasting memory with a constant large block size. If we tune
malloc, memory we never allocate is not wasted; it is just not released as
eagerly once it has been allocated.

Or do you want a benchmark with an even bigger initial block size? Or with the
growing blocks patch and a large initial size? I can run either, I just want
to understand what is interesting to you.

One thing that would be interesting would be to trace the total amount of
memory allocated in the different cases. This is something I will need to do
anyway when I propose that patch.
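
One coarse way to compare the cases, sketched here only as an assumption about how it could be measured, is to read the process's peak resident set size with `getrusage()` after each benchmark run. This is a proxy for what the process really kept mapped, not an exact count of bytes requested from malloc. The function name `sketch_peak_rss` is hypothetical.

```c
#include <sys/resource.h>   /* getrusage(), struct rusage, RUSAGE_SELF */

/*
 * Hypothetical helper: peak resident set size of the current process,
 * or -1 on error. On Linux ru_maxrss is reported in kilobytes; other
 * platforms may use a different unit.
 */
static long
sketch_peak_rss(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_maxrss;
}
```

Calling this at the end of each run (fixed block size, growing blocks, with and without malloc tuning) would give one number per case to put next to the timings.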

Best regards,

--
Ronan Dunklau
