Re: Add GUC to tune glibc's malloc implementation.

From: Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
To: pgsql-hackers(at)postgresql(dot)org, Peter Eisentraut <peter(at)eisentraut(dot)org>
Cc: tomas(dot)vondra(at)enterprisedb(dot)com
Subject: Re: Add GUC to tune glibc's malloc implementation.
Date: 2023-06-26 06:38:35
Message-ID: 2285512.ElGaqSPkdT@aivenlaptop

On Friday, 23 June 2023, 22:55:51 CEST, Peter Eisentraut wrote:
> On 22.06.23 15:35, Ronan Dunklau wrote:
> > The thing is, by default, those parameters are adjusted dynamically by the
> > glibc itself. It starts with quite small thresholds, and raises them when
> > the program frees some memory, up to a certain limit. This patch proposes
> > a new GUC allowing the user to adjust those settings according to their
> > workload.
> >
> > This can cause problems. Let's take for example a table with 10k rows, and
> > 32 columns (as defined by a bench script David Rowley shared last year
> > when discussing the GenerationContext for tuplesort), and execute the
> > following
> > query, with 32MB of work_mem:

> I don't follow what you are trying to achieve with this. The examples
> you show appear to work sensibly in my mind. Using this setting, you
> can save some of the adjustments that glibc does after the first query.
> But that seems only useful if your session only does one query. Is that
> what you are doing?

No, not at all: glibc does not do the right thing, we don't "save" it.
I will try to rephrase that.

In the first test case I showed, glibc does adjust its threshold, but to a
suboptimal value: repeated executions of a query needing the same amount of
memory still release that memory back to the kernel and move the brk pointer
each time, and glibc never adjusts the threshold again. On the other hand, by
manually setting the thresholds to a higher value, we keep the memory in
malloc's freelist for reuse by the next queries. As shown in the benchmark
results I posted, this can have quite a dramatic effect, going from 396 tps
to 894. For ease of benchmarking, it is a single query being executed over
and over again, but the same thing would be true if different queries
allocating memory were executed by a single backend.
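
To make "manually adjusting the thresholds" concrete, here is a minimal sketch
using glibc's mallopt(). It is not the code from the patch: the helper name
pin_malloc_thresholds and the choice of passing the same value to both
parameters are assumptions made for illustration only. According to the
mallopt(3) man page, explicitly setting M_TRIM_THRESHOLD or M_TOP_PAD also
disables glibc's dynamic adjustment of the mmap threshold.

```c
#include <malloc.h>   /* mallopt() and the M_* parameters (glibc-specific) */

/*
 * Hypothetical helper, not taken from the patch: ask glibc to keep up to
 * keep_bytes of freed memory at the top of the heap instead of returning
 * it to the kernel.
 *
 * Returns 1 if both mallopt() calls succeeded, 0 otherwise.
 */
static int
pin_malloc_thresholds(int keep_bytes)
{
    /* Only shrink the heap (move brk down) once more than keep_bytes
     * are free at the top of the heap. */
    if (mallopt(M_TRIM_THRESHOLD, keep_bytes) == 0)
        return 0;

    /* Request keep_bytes of extra padding whenever the heap is grown,
     * so fewer brk calls are needed. Assumption: reusing the same value
     * here is a simplification, not a recommendation. */
    if (mallopt(M_TOP_PAD, keep_bytes) == 0)
        return 0;

    return 1;
}
```

A backend-like process could call pin_malloc_thresholds(64 * 1024 * 1024)
once at startup; the 64MB figure is only an example and would have to be
chosen according to the workload.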

The worst part is that this is unpredictable: depending on past memory
allocation patterns, glibc will end up in different states, and exhibit
completely different performance for all subsequent queries. In fact, this is
what Tomas noticed last year (see [0]), which led to this investigation.

I also tried to show that in certain cases glibc's behaviour can, on the
contrary, be too greedy, and hold on to too much memory when we need the
memory only once and never allocate it again.

I hope what I'm trying to achieve is clearer now. Maybe this patch is not
the best way to go about this, but since the memory allocator's behaviour can
have such an impact, it's a bit sad that we have to leave half the performance
on the table because of it when there are easily accessible knobs to avoid it.

[0] https://www.postgresql.org/message-id/bcdd4e3e-c12d-cd2b-7ead-a91ad416100a%40enterprisedb.com
