Re: Huge backend memory footprint

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Huge backend memory footprint
Date: 2017-12-22 13:13:30
Message-ID: CAGTBQpY8XtV_DTtOYcOSARL0Zbt7B=NsvLWi1_uHaPMF6t5vrg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Dec 22, 2017 at 10:07 AM, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru> wrote:

> While experimenting with a pthreads version of Postgres, I found out that
> I cannot create more than 100k backends, even on a system with 4TB of RAM.
> I do not want to discuss the idea of creating such a large number of
> backends right now - yes, most real production systems use pgbouncer or a
> similar connection pooling tool to restrict the number of connections to
> the database. But there are 144 cores in this system, and if we want to
> utilize all system resources, the optimal number of backends will be
> several hundred (especially taking into account that Postgres backends are
> usually not CPU bound and have to read data from disk, so the number of
> backends should be much larger than the number of cores).
>
> There are several per-backend arrays in Postgres whose size depends on
> the maximal number of backends.
> For max_connections=100000, Postgres allocates 26MB for each snapshot:
>
> CurrentRunningXacts->xids = (TransactionId *)
>     malloc(TOTAL_MAX_CACHED_SUBXIDS * sizeof(TransactionId));
>
> This value seems to be overestimated, because TOTAL_MAX_CACHED_SUBXIDS
> is defined as:
>
> /*
>  * During Hot Standby processing we have a data structure called
>  * KnownAssignedXids, created in shared memory. Local data structures are
>  * also created in various backends during GetSnapshotData(),
>  * TransactionIdIsInProgress() and GetRunningTransactionData(). All of the
>  * main structures created in those functions must be identically sized,
>  * since we may at times copy the whole of the data structures around. We
>  * refer to this size as TOTAL_MAX_CACHED_SUBXIDS.
>  *
>  * Ideally we'd only create this structure if we were actually doing hot
>  * standby in the current run, but we don't know that yet at the time
>  * shared memory is being set up.
>  */
> #define TOTAL_MAX_CACHED_SUBXIDS \
>     ((PGPROC_MAX_CACHED_SUBXIDS + 1) * PROCARRAY_MAXPROCS)
>
>
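For concreteness, the 26MB figure can be reproduced with a back-of-the-envelope
calculation (a sketch: it assumes PGPROC_MAX_CACHED_SUBXIDS = 64 and
sizeof(TransactionId) = 4, as in the sources of that era, and approximates
PROCARRAY_MAXPROCS by max_connections alone, ignoring auxiliary processes and
prepared transactions):

#include <stdio.h>

int main(void)
{
    long max_connections = 100000;
    /* TOTAL_MAX_CACHED_SUBXIDS = (PGPROC_MAX_CACHED_SUBXIDS + 1) * PROCARRAY_MAXPROCS */
    long total_cached_subxids = (64 + 1) * max_connections;
    long bytes = total_cached_subxids * 4;      /* sizeof(TransactionId) */
    printf("per-snapshot xids array: %ld bytes (~%ld MB)\n",
           bytes, bytes / 1000000);             /* prints ~26 MB */
    return 0;
}
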
> Another 12MB array is used for deadlock detection:
>
> #2  0x00000000008ac397 in InitDeadLockChecking () at deadlock.c:196
> 196             (EDGE *) palloc(maxPossibleConstraints * sizeof(EDGE));
> (gdb) list
> 191      * last MaxBackends entries in possibleConstraints[] are reserved as
> 192      * output workspace for FindLockCycle.
> 193      */
> 194     maxPossibleConstraints = MaxBackends * 4;
> 195     possibleConstraints =
> 196             (EDGE *) palloc(maxPossibleConstraints * sizeof(EDGE));
> 197
>
>
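The 12MB figure works out the same way (again a sketch: EDGE below is a mock
of the struct in deadlock.c of that era - two PGPROC pointers plus TopoSort
workspace - so the exact size depends on version and padding; 24 bytes gives
roughly 10MB, and with an extra pointer field or padding it reaches the quoted
~12MB):

#include <stdio.h>

/* Mock of the EDGE struct from deadlock.c; field layout is an assumption. */
typedef struct
{
    void *waiter;   /* the waiting PGPROC */
    void *blocker;  /* the PGPROC it is waiting for */
    int   pred;     /* workspace for TopoSort */
    int   link;     /* workspace for TopoSort */
} EDGE;

int main(void)
{
    long max_backends = 100000;
    long max_possible_constraints = max_backends * 4;
    long bytes = max_possible_constraints * (long) sizeof(EDGE);
    printf("possibleConstraints: %ld bytes (~%ld MB)\n",
           bytes, bytes / 1000000);
    return 0;
}
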
> As a result, the amount of dynamic memory allocated for each backend
> exceeds 50MB, so 100k backends cannot be launched even on a system with
> 4TB! I think that we should use a more accurate allocation policy in these
> places and not waste memory in this manner (even if it is virtual).
>

Don't forget each thread also has its own stack. I don't think you can
expect 100k threads to ever work.
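
To put numbers on the stacks: on typical glibc/Linux systems the default
thread stack size is 8MB (it follows RLIMIT_STACK), so 100k threads would
reserve on the order of 800GB of address space for stacks alone, unless you
shrink them with pthread_attr_setstacksize(). A minimal sketch to check the
default on your system (the 8MB value is a common default, not a guarantee):

#include <pthread.h>
#include <stdio.h>

int main(void)
{
    pthread_attr_t attr;
    size_t stack_size = 0;

    /* On glibc, querying a freshly initialized attr reports the
     * process-wide default stack size. */
    pthread_attr_init(&attr);
    pthread_attr_getstacksize(&attr, &stack_size);
    printf("default stack size: %zu bytes\n", stack_size);
    printf("100000 threads -> ~%zu GB of stack address space\n",
           stack_size * 100000 / (1024UL * 1024 * 1024));
    pthread_attr_destroy(&attr);
    return 0;
}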

If you get to that point, you really need to consider async query
execution. There was a lot of work related to that in other threads; you
may want to take a look.
