Re: Estimating HugePages Requirements?

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Don Seiler <don(at)seiler(dot)us>
Cc: P C <puravc(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Estimating HugePages Requirements?
Date: 2021-08-09 23:58:53
Message-ID: 20210809235852.GA2426@telsasoft.com
Lists: pgsql-admin pgsql-hackers

On Thu, Jun 10, 2021 at 07:23:33PM -0500, Justin Pryzby wrote:
> On Wed, Jun 09, 2021 at 10:55:08PM -0500, Don Seiler wrote:
> > On Wed, Jun 9, 2021, 21:03 P C <puravc(at)gmail(dot)com> wrote:
> >
> > > I agree, it's confusing for many, and that confusion arises from the fact
> > > that you usually talk about shared_buffers in MB or GB whereas hugepages have
> > > to be configured in units of 2MB. But once they understand that, they realize
> > > it's pretty simple.
> > >
> > > Don, we have experienced the same not just with postgres but also with
> > > oracle. I haven't been able to get to the root of it, but what we usually do
> > > is add another 100-200 pages, and that works for us. If the SGA or
> > > shared_buffers is high, e.g. 96GB, then we add 250-500 pages. Those few
> > > hundred MB may be wasted (because the moment you configure hugepages, the
> > > operating system considers them used and will not use them for anything
> > > else), but nowadays servers easily have 64 or 128GB of RAM, and wasting
> > > 500MB to 1GB does not really hurt.
> >
> > I don't have a problem with the math, just wanted to know if it was
> > possible to better estimate what the actual requirements would be at
> > deployment time. My fallback will probably be to do what you did and just
> > pad with an extra 512MB by default.
>
> It's because the huge allocation isn't just shared_buffers, but also
> wal_buffers:
>
> | The amount of shared memory used for WAL data that has not yet been written to disk.
> | The default setting of -1 selects a size equal to 1/32nd (about 3%) of shared_buffers, ...
>
> .. and other stuff:
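
To put rough numbers on that (a back-of-the-envelope sketch, not postgres
code; it assumes 2MB huge pages, the default wal_buffers sizing, and a
guessed ~1% of shared_buffers for the remaining shared structures -- that
last figure is exactly the part that's hard to predict):

/*
 * Sketch only: estimate how many 2MB huge pages a given shared_buffers
 * setting really needs once wal_buffers and the rest are added on top.
 * The 1% allowance for "other stuff" is a guess, not a measured value.
 */
#include <stdio.h>
#include <stdint.h>

#define MB ((uint64_t) 1024 * 1024)
#define HUGE_PAGE_SIZE (2 * MB)			/* typical x86_64 huge page */
#define WAL_SEGMENT_SIZE (16 * MB)		/* default WAL segment size */

int
main(void)
{
	uint64_t	shared_buffers = 2048 * MB;		/* shared_buffers = 2GB */
	uint64_t	wal_buffers;
	uint64_t	other;
	uint64_t	total;

	/* default wal_buffers: shared_buffers / 32, capped at one WAL segment */
	wal_buffers = shared_buffers / 32;
	if (wal_buffers > WAL_SEGMENT_SIZE)
		wal_buffers = WAL_SEGMENT_SIZE;

	/* rough allowance for locks, proc array, SLRUs, etc. */
	other = shared_buffers / 100;

	total = shared_buffers + wal_buffers + other;

	printf("huge pages needed: %llu (vs %llu for shared_buffers alone)\n",
		   (unsigned long long) ((total + HUGE_PAGE_SIZE - 1) / HUGE_PAGE_SIZE),
		   (unsigned long long) (shared_buffers / HUGE_PAGE_SIZE));
	return 0;
}

With those assumptions, shared_buffers=2GB comes out to roughly twenty pages
on top of the 1024 needed for the buffers themselves, which is why padding
nr_hugepages by 100-200 pages, as above, usually covers it.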

I wonder if this shouldn't be solved the other way around:

Define shared_buffers as the exact size to be allocated/requested from the OS
(regardless of whether they're huge pages or not), and have postgres compute
everything else based on that. So shared_buffers=2GB would end up being 1950MB
(or so) of buffer cache. We'd have to check that after the other allocations,
there's still at least 128kB left for the buffer cache. Maybe we'd have to
bump the minimum value of shared_buffers.
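
In rough pseudo-C, the shape of it would be something like this (a
hypothetical sketch with made-up names and numbers, not a patch):

/*
 * Hypothetical sketch: shared_buffers is taken as the total shared-memory
 * request; "other_shmem" stands in for the sum of the XLOG/CLOG/locks/etc.
 * estimates, and whatever is left over becomes the buffer cache, subject
 * to a 128kB floor.
 */
#include <stdio.h>
#include <stddef.h>

#define BLCKSZ				8192
#define MIN_BUFFER_CACHE	(128 * 1024)

static size_t
buffer_cache_from_total(size_t total_request, size_t other_shmem)
{
	if (total_request < other_shmem + MIN_BUFFER_CACHE)
		return 0;			/* caller rejects, or shared_buffers minimum is raised */
	return (total_request - other_shmem) / BLCKSZ * BLCKSZ;
}

int
main(void)
{
	size_t		total = (size_t) 2048 * 1024 * 1024;	/* shared_buffers = 2GB */
	size_t		other = (size_t) 98 * 1024 * 1024;		/* made-up "everything else" */

	printf("buffer cache: %zu MB of the %zu MB requested\n",
		   buffer_cache_from_total(total, other) / (1024 * 1024),
		   total / (1024 * 1024));
	return 0;
}

The nice property is that the number the admin reserves (huge pages, cgroup
limits, etc.) is exactly the number they configured, rather than something
derived from it.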

> src/backend/storage/ipc/ipci.c
> /*
> * Size of the Postgres shared-memory block is estimated via
> * moderately-accurate estimates for the big hogs, plus 100K for the
> * stuff that's too small to bother with estimating.
> *
> * We take some care during this phase to ensure that the total size
> * request doesn't overflow size_t. If this gets through, we don't
> * need to be so careful during the actual allocation phase.
> */
> size = 100000;
> size = add_size(size, PGSemaphoreShmemSize(numSemas));
> size = add_size(size, SpinlockSemaSize());
> size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
> sizeof(ShmemIndexEnt)));
> size = add_size(size, dsm_estimate_size());
> size = add_size(size, BufferShmemSize());
> size = add_size(size, LockShmemSize());
> size = add_size(size, PredicateLockShmemSize());
> size = add_size(size, ProcGlobalShmemSize());
> size = add_size(size, XLOGShmemSize());
> size = add_size(size, CLOGShmemSize());
> size = add_size(size, CommitTsShmemSize());
> size = add_size(size, SUBTRANSShmemSize());
> size = add_size(size, TwoPhaseShmemSize());
> size = add_size(size, BackgroundWorkerShmemSize());
> size = add_size(size, MultiXactShmemSize());
> size = add_size(size, LWLockShmemSize());
> size = add_size(size, ProcArrayShmemSize());
> size = add_size(size, BackendStatusShmemSize());
> size = add_size(size, SInvalShmemSize());
> size = add_size(size, PMSignalShmemSize());
> size = add_size(size, ProcSignalShmemSize());
> size = add_size(size, CheckpointerShmemSize());
> size = add_size(size, AutoVacuumShmemSize());
> size = add_size(size, ReplicationSlotsShmemSize());
> size = add_size(size, ReplicationOriginShmemSize());
> size = add_size(size, WalSndShmemSize());
> size = add_size(size, WalRcvShmemSize());
> size = add_size(size, PgArchShmemSize());
> size = add_size(size, ApplyLauncherShmemSize());
> size = add_size(size, SnapMgrShmemSize());
> size = add_size(size, BTreeShmemSize());
> size = add_size(size, SyncScanShmemSize());
> size = add_size(size, AsyncShmemSize());
> #ifdef EXEC_BACKEND
> size = add_size(size, ShmemBackendArraySize());
> #endif
>
> /* freeze the addin request size and include it */
> addin_request_allowed = false;
> size = add_size(size, total_addin_request);
>
> /* might as well round it off to a multiple of a typical page size */
> size = add_size(size, 8192 - (size % 8192));
>
> BTW, I think it'd be nice if this were a NOTICE:
> | elog(DEBUG1, "mmap(%zu) with MAP_HUGETLB failed, huge pages disabled: %m", allocsize);
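
For context, a stripped-down illustration of the try-and-fall-back behavior
that message comes from (not the actual sysv_shmem.c code, and the mmap
flags are simplified): with huge_pages=try, a failed MAP_HUGETLB request
silently degrades to normal pages, and that DEBUG1 line is the only
evidence left behind.

/*
 * Illustration only: request anonymous shared memory with huge pages,
 * falling back to normal pages if the reservation is too small.  This is
 * the situation where surfacing the message above DEBUG1 would help.
 */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/mman.h>

static void *
alloc_shmem(size_t allocsize)
{
	void	   *ptr;

	ptr = mmap(NULL, allocsize, PROT_READ | PROT_WRITE,
			   MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
	if (ptr == MAP_FAILED)
	{
		/* typically ENOMEM when vm.nr_hugepages is too low */
		fprintf(stderr, "mmap(%zu) with MAP_HUGETLB failed, huge pages disabled: %s\n",
				allocsize, strerror(errno));
		ptr = mmap(NULL, allocsize, PROT_READ | PROT_WRITE,
				   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	}
	return ptr;
}

int
main(void)
{
	/* 1GB request: a multiple of the huge page size, as mmap requires */
	void	   *p = alloc_shmem((size_t) 1024 * 1024 * 1024);

	return (p == MAP_FAILED) ? 1 : 0;
}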
