Re: Changing shared_buffers without restart

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Changing shared_buffers without restart
Date: 2025-06-16 12:39:17
Message-ID: CAExHW5sYg_d4O7oGRqbomnVODeqR3YNAeYAa526n1dsWCM=+Fg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jun 10, 2025 at 4:39 PM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:

Here's the patchset rebased on f85f6ab051b7cf6950247e5fa6072c4130613555,
with some more fixes as described below.

> 0001 - 0008 are same as the previous patchset
>
> 0009 adds support for shrinking shared buffers. It makes two changes: (a) evict the buffers outside the new buffer size, and (b) remove buffers whose buffer id is outside the new buffer size from the free list. If a buffer being evicted is pinned, the operation is aborted and a FATAL error is raised. I think we need to make this behaviour less severe, for example by rolling back the operation or waiting for the pinned buffer to be unpinned. Even better if we could let users control the behaviour. But we need better infrastructure to do such things. That's one TODO left in the patch.
>

Patches up to 0009 are the same as in the previous patchset.
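
To make the two steps of 0009 concrete, here is a minimal, self-contained sketch. It is not code from the patch: ToyBuffer, ToyPool, toy_shrink() and the TOY_* markers are hypothetical names, writing back dirty pages is ignored, and the function returns false instead of raising FATAL. It also checks every pin before evicting anything, which is closer to the "roll back" behaviour discussed above than to what the patch currently does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_END_OF_LIST (-1)   /* hypothetical marker: end of the free list */
#define TOY_NOT_IN_LIST (-2)   /* hypothetical marker: buffer is not on the free list */

typedef struct ToyBuffer
{
    int  pin_count;   /* > 0 means some backend still uses this buffer */
    bool valid;       /* true if the buffer currently holds a page */
    int  free_next;   /* id of the next free buffer, or one of the markers */
} ToyBuffer;

typedef struct ToyPool
{
    ToyBuffer *bufs;
    int        nbuffers;    /* current pool size */
    int        first_free;  /* head of the free list, or TOY_END_OF_LIST */
} ToyPool;

/*
 * Try to shrink the pool to new_size buffers (ids 0 .. new_size - 1 survive).
 * Returns false, leaving the pool untouched, if a doomed buffer is pinned.
 */
bool
toy_shrink(ToyPool *pool, int new_size)
{
    int *link;

    assert(pool != NULL);
    if (new_size < 0 || new_size > pool->nbuffers)
        return false;

    /* Refuse up front if any buffer outside the new size is pinned. */
    for (int i = new_size; i < pool->nbuffers; i++)
        if (pool->bufs[i].pin_count > 0)
            return false;

    /* (a) Evict the buffers outside the new size. */
    for (int i = new_size; i < pool->nbuffers; i++)
        pool->bufs[i].valid = false;

    /* (b) Unlink buffers outside the new size from the free list. */
    link = &pool->first_free;
    while (*link != TOY_END_OF_LIST)
    {
        int id = *link;

        if (id >= new_size)
        {
            *link = pool->bufs[id].free_next;
            pool->bufs[id].free_next = TOY_NOT_IN_LIST;
        }
        else
            link = &pool->bufs[id].free_next;
    }

    pool->nbuffers = new_size;
    return true;
}
```

A caller would retry or report an error when toy_shrink() returns false; which of those is appropriate is exactly the open question in the TODO.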

> 0010 is about Strategy reinitialization. Once we expand the buffers, the new buffers need to be added to the free list. Some StrategyControl area members (not all) need to be adjusted. That's what this patch does. But a deeper adjustment in BgBufferSync() and ClockSweepTick() is required. Further, we need to do something about the buffer lookup table. More on that later in the email.

0010 is improved with fixes for the background writer and
ClockSweepTick(). Now we just reset the information saved between
calls to BgBufferSync(), since it doesn't make sense after NBuffers
has changed. The members in StrategyControl related to
ClockSweepTick() are reset for the same reason. More details are in
the commit message.
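
As an illustration of why the saved clock-sweep state is reset rather than adjusted, here is a small sketch. It is a toy model, not the code in 0010: ToyStrategy, toy_clock_tick() and toy_reset_after_resize() are hypothetical names, and the real StrategyControl members are updated atomically and under locks, which is omitted here.

```c
#include <assert.h>
#include <stdint.h>

typedef struct ToyStrategy
{
    uint32_t next_victim;      /* clock hand: next buffer id to inspect */
    uint32_t complete_passes;  /* full trips of the hand around the pool */
} ToyStrategy;

/*
 * Return the buffer id under the clock hand and advance the hand.
 * The result is only meaningful if next_victim < nbuffers on entry.
 */
uint32_t
toy_clock_tick(ToyStrategy *s, uint32_t nbuffers)
{
    uint32_t victim;

    assert(nbuffers > 0);
    victim = s->next_victim;
    s->next_victim++;
    if (s->next_victim >= nbuffers)
    {
        s->next_victim = 0;
        s->complete_passes++;
    }
    return victim;
}

/*
 * After the pool size changes, the hand position and pass count describe
 * a pool that no longer exists, so start over from a clean state.
 */
void
toy_reset_after_resize(ToyStrategy *s)
{
    s->next_victim = 0;
    s->complete_passes = 0;
}
```

Without the reset, a hand left at position 7 in an 8-buffer pool would hand out buffer id 7 after a shrink to 4 buffers, which is out of range.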

0011: GetBufferFromRing() invalidates the buffers beyond NBuffers
since those may have been added before resizing and are not valid
anymore. Details in commit message.
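
The check described for 0011 can be sketched roughly as follows. This is a simplified model and not the patch itself: ToyRing, TOY_RING_SIZE and toy_get_buffer_from_ring() are hypothetical names, buffer ids are assumed to be 1-based with 0 meaning "no buffer", and the pin and usage-count checks of the real function are left out.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_INVALID_BUFFER 0   /* assumed convention: 0 means "no buffer" */
#define TOY_RING_SIZE      4   /* hypothetical ring length */

typedef struct ToyRing
{
    int current;                   /* index of the slot handed out last */
    int buffers[TOY_RING_SIZE];    /* 1-based buffer ids, or TOY_INVALID_BUFFER */
} ToyRing;

/*
 * Advance the ring and return the buffer in the new slot.  A slot that
 * remembers a buffer id beyond the current pool size is cleared, so the
 * caller falls back to the normal allocation path.
 */
int
toy_get_buffer_from_ring(ToyRing *ring, int nbuffers)
{
    int buf;

    assert(ring != NULL);
    ring->current = (ring->current + 1) % TOY_RING_SIZE;
    buf = ring->buffers[ring->current];

    if (buf == TOY_INVALID_BUFFER)
        return TOY_INVALID_BUFFER;

    if (buf > nbuffers)
    {
        /* Stale entry from before the resize: forget it. */
        ring->buffers[ring->current] = TOY_INVALID_BUFFER;
        return TOY_INVALID_BUFFER;
    }

    return buf;
}
```

Clearing the slot means the stale id is seen at most once per slot; later trips around the ring find an empty slot instead.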

>
> 0011-0012 fix compilation issues in these patches, but those fixes are not correct. The patches are there so that binaries can be built without any compilation issues and someone can experiment with buffer resizing. The good thing is that the compilation fixes are in the SQL-callable functions pg_get_shmem_pagesize() and pg_get_shmem_numa(). So there's no ill effect from these patches as long as those two functions are not called.

These patches are now 0012 and 0013 respectively.

>
> Buffer lookup table resizing
> ------------------------------------
> The size of the buffer lookup table depends upon (number of shared buffers + number of partitions in the shared buffer lookup table). If we shrink the buffer pool, the buffer lookup table will become sparse but still useful. If we expand the buffers we need to expand the buffer lookup table too. That's not implemented in the current patchset. There are two solutions here:
>
> 1. We map a lot of extra address space (not memory) initially to accommodate future expansion of the shared buffer pool. Let's say that the total address space is sufficient to accommodate Nx buffers. A simple solution is to allocate a buffer lookup table with Nx initial entries so that we don't have to resize the buffer lookup table ever. It will waste memory, but we might be ok with that as a version 1 solution. According to my offline discussion with David Rowley, buffer lookups in sparse hash tables are inefficient because of more cacheline faults. Whether that translates to any noticeable performance degradation in TPS needs to be measured.
>
> 2. The alternate solution is to resize the buffer mapping table as well. This means that we rehash all the entries again, which may take a long time, and the partitions will remain locked for that amount of time. Not to mention this will require a non-trivial change to the dynahash implementation.

I haven't spent time on this yet.
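
For a feel of what solution 1 costs, the sizing rule quoted above (number of shared buffers + number of partitions) can be turned into a little arithmetic. This is only an illustration: the function names are hypothetical, 128 partitions is an assumption taken from the default build, and the per-entry size is deliberately left out since it is not stated here.

```c
#include <assert.h>

#define TOY_NUM_PARTITIONS 128   /* assumed partition count of the lookup table */

/* Entries needed for a pool of nbuffers buffers, per the rule quoted above. */
long
toy_lookup_entries(long nbuffers)
{
    assert(nbuffers >= 0);
    return nbuffers + TOY_NUM_PARTITIONS;
}

/*
 * Entries that sit unused when the table is sized for max_nbuffers
 * but only cur_nbuffers are currently in the pool.
 */
long
toy_spare_entries(long max_nbuffers, long cur_nbuffers)
{
    assert(cur_nbuffers <= max_nbuffers);
    return toy_lookup_entries(max_nbuffers) - toy_lookup_entries(cur_nbuffers);
}
```

For example, reserving for 65536 buffers while running with 16384 leaves 49152 entries unused; whether that sparseness hurts TPS is the measurement mentioned above.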

--
Best Wishes,
Ashutosh Bapat

Attachment Content-Type Size
0001-Allow-to-use-multiple-shared-memory-mapping-20250616.patch text/x-patch 30.0 KB
0002-Address-space-reservation-for-shared-memory-20250616.patch text/x-patch 21.6 KB
0004-Introduce-pending-flag-for-GUC-assign-hooks-20250616.patch text/x-patch 12.8 KB
0005-Introduce-pss_barrierReceivedGeneration-20250616.patch text/x-patch 7.3 KB
0003-Introduce-multiple-shmem-segments-for-share-20250616.patch text/x-patch 11.7 KB
0008-Support-resize-for-hugetlb-20250616.patch text/x-patch 4.3 KB
0006-Allow-to-resize-shared-memory-without-resta-20250616.patch text/x-patch 39.8 KB
0007-Use-anonymous-files-to-back-shared-memory-s-20250616.patch text/x-patch 10.7 KB
0010-Reinitialize-StrategyControl-after-resizing-20250616.patch text/x-patch 16.8 KB
0009-Support-shrinking-shared-buffers-20250616.patch text/x-patch 13.4 KB
0012-Fix-compilation-failure-in-pg_get_shmem_all-20250616.patch text/x-patch 1.4 KB
0011-Additional-validation-for-buffer-in-the-rin-20250616.patch text/x-patch 2.1 KB
0013-Fix-compilation-failure-in-pg_get_shmem_pag-20250616.patch text/x-patch 959 bytes
