Re: Shared memory size computation oversight?

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Georgios <gkokolatos(at)protonmail(dot)com>
Cc: Julien Rouhaud <julien(dot)rouhaud(at)free(dot)fr>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>
Subject: Re: Shared memory size computation oversight?
Date: 2021-03-03 16:36:08
Message-ID: 20210303163608.lpkz6n226qj2b4bn@nol
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 03, 2021 at 03:37:20PM +0000, Georgios wrote:
>
> Please excuse me as I speak most from the side of ignorance. I am slightly curious
> to understand something in your patch, if you can be kind enough to explain it to
> me.
>
> The commit 09adc9a8c09 you pointed out to, as far as I can see, changed the total
> size of the shared memory, not individual bits. It did explain that the users of
> that space had properly padded their data, but even assuming that they did not do
> that as asked, the result would (or rather should) be cache misses, not running out
> of reserved memory.

It actually changed the allocation of individual bits inside the fixed size
shared memory, as underlying users ends up calling shmemalloc(),
but did not accurately changed the actual total size of shared memory as many
estimates are done either still using MAXALIGN or simply not accounting for the
padding.

> My limited understanding is also based in a comment in CreateSharedMemoryAndSemaphores()
>
> /*
> * Size of the Postgres shared-memory block is estimated via
> * moderately-accurate estimates for the big hogs, plus 100K for the
> * stuff that's too small to bother with estimating.
> *
> * We take some care during this phase to ensure that the total size
> * request doesn't overflow size_t. If this gets through, we don't
> * need to be so careful during the actual allocation phase.
> */
> size = 100000;
>
> Of course, my argument falls bare, if the size estimates of each of the elements is
> rather underestimated. Indeed this is the argument in the present patch expressed in
> code in hash_estimate_size most prominently, here:
>
> size = add_size(size,
> - mul_size(nElementAllocs,
> - mul_size(elementAllocCnt, elementSize)));
> + CACHELINEALIGN(Nmul_size(nElementAllocs,
> + mul_size(elementAllocCnt, elementSize))));
>
> (small note, Nmul_size is a typo of mul_size, but that is minor, by amending it the
> code compiles).

Oops, I'll fix that.

> To conclude, the running out of shared memory, seems to me to be fixed rather
> vaguely with this patch. I am not claiming that increasing the memory used by
> the elements is not needed, I am claiming that I can not clearly see how is that
> specific increase needed.

I'm also not entirely convinced that those fixes are enough to explain the "out
of shared memory" issue originally reported, but that's the only class of
problem that I could spot which could possibly explain it.

As Robert initially checked, it should be at worse a 6kB understimate with
default parameters (it may be slightly more now as we have more shared memory
users, but it should be the same order of magnitude), and I don't think that
there it will vary a lot with huge shared_buffers and/or max_connections.

Those 6kB should be compared to how much room is giving the initial 100k vs how
much the small stuff is actually computing.

Note that there are still some similar issue in the code. A quick look at
ProcGlobalShmemSize() vs InitProcGlobal() seems to indicate that it's missing 5
CACHELINEALIGN in the estimate. Unless someone object, I'll soon do a full
review of all the estimates in CreateSharedMemoryAndSemaphores() and fix all
additional occurences that I can find.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2021-03-03 16:40:52 Re: Shared memory size computation oversight?
Previous Message 'Alvaro Herrera' 2021-03-03 16:34:30 Re: PATCH: Batch/pipelining support for libpq