### Buffer count variables

The pool size is tracked with two atomic water marks in `DynamicSharedBuffersControl`:

- **`lowNBuffers`** — the lower bound. Backends only allocate buffers from `[0, lowNBuffers)`.
- **`highNBuffers`** — the upper bound. Buffer descriptor memory in
  `[0, highNBuffers)` is allocated and initialized.

We also maintain three variables that are used at postmaster startup only:

- **`NBuffers`** — Deprecated and removed.
- **`NBuffersGUC`** — Backs the `shared_buffers` GUC. It captures the
  initial pool size at postmaster startup. `SHOW shared_buffers` reports the current `lowNBuffers` value.
- **`MaxNBuffers`** — Backs the `max_shared_buffers` GUC and is the upper limit of `highNBuffers`.

Invariants:

- `MIN_SHARED_BUFFERS (= 16) <= lowNBuffers <= highNBuffers <= MaxNBuffers`.
- In steady state (no resize in progress), `lowNBuffers == highNBuffers`.
- `lowNBuffers < highNBuffers` only while a shrink is in flight, or momentarily between the two publishing writes at the end of an expand.
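
The invariants above can be written down as a small check. This is a minimal sketch using C11 atomics; `ToyBuffersControl` and the `toy_*` helpers are hypothetical names for illustration and do not reflect the actual layout of `DynamicSharedBuffersControl`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MIN_SHARED_BUFFERS 16

/* Hypothetical stand-in for the two water marks. */
typedef struct ToyBuffersControl
{
    atomic_int lowNBuffers;
    atomic_int highNBuffers;
} ToyBuffersControl;

/*
 * Returns true when MIN_SHARED_BUFFERS <= low <= high <= maxNBuffers.
 * The two loads are independent, so this is only meaningful on a
 * quiescent pool or while resizes are otherwise excluded.
 */
static bool
toy_water_marks_ok(ToyBuffersControl *ctl, int maxNBuffers)
{
    int low = atomic_load(&ctl->lowNBuffers);
    int high = atomic_load(&ctl->highNBuffers);

    return MIN_SHARED_BUFFERS <= low && low <= high && high <= maxNBuffers;
}

/* Steady state means both water marks coincide. */
static bool
toy_is_steady_state(ToyBuffersControl *ctl)
{
    return atomic_load(&ctl->lowNBuffers) == atomic_load(&ctl->highNBuffers);
}
```

A debug build could assert `toy_water_marks_ok()` after each resize step; `toy_is_steady_state()` returning false then indicates a resize in progress.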


#### Steady state at 4 CU

| Variable         | Value                                    |
| ---------------- | ---------------------------------------- |
| `MaxNBuffers`    | ~5M  (~41 GB, sized for 8 CU)            |
| `lowNBuffers`    | ~3M                                      |
| `highNBuffers`   | ~3M                                      |
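
The byte figures in the table follow from the buffer count times the block size. The sketch below assumes the default PostgreSQL block size of 8192 bytes and counts buffer blocks only (descriptors and other per-buffer metadata are not included); `TOY_BLCKSZ` and `toy_buffers_to_bytes` are illustrative names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: default PostgreSQL block size. */
#define TOY_BLCKSZ 8192

/* Bytes of buffer-block memory needed for nbuffers buffers. */
static uint64_t
toy_buffers_to_bytes(uint64_t nbuffers)
{
    return nbuffers * TOY_BLCKSZ;
}
```

With these assumptions, 5,000,000 buffers come to 40,960,000,000 bytes, which is the "~41 GB" (decimal) in the table.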

### When to use which

- Buffer allocation code (clock-sweep, freelist, ring-buffer sizing, etc.) should call `GetLowNBuffers()`, which returns `lowNBuffers`.
- Functions that visit the buffer array (e.g. `DropRelationBuffers`) must call `BEGIN_NBUFFERS_ACCESS(localNBuffers)` before visiting the buffer array and `END_NBUFFERS_ACCESS(localNBuffers)` once they are done. `BEGIN_NBUFFERS_ACCESS` records `localNBuffers = highNBuffers` and holds `AccessNBuffersLock` in shared mode until `END_NBUFFERS_ACCESS`, which is what makes the snapshot safe to dereference for the duration of the access.
- `GetHighNBuffers()` is appropriate only for **non-critical sizing decisions** that can tolerate a stale value (e.g. picking the ring-buffer size for a new strategy). It must **never** be used to bound a loop that reads buffer descriptors or buffer blocks: by the time you index into the array, the resize coordinator may have unmapped the underlying pages. Use `BEGIN_NBUFFERS_ACCESS`/`END_NBUFFERS_ACCESS` for that.


### Triggering a resize

A resize operation is a single function call:

```sql
SELECT pg_resize_shared_buffers('<new_size>');
```


### Shrink

The initial state of the memory region is:

```
                                       lowNBuffers
 0               new_size              highNBuffers          MaxNBuffers
|--------------------|---------------------|--------------------------|
       ALLOCATED              ALLOCATED                  RESERVED
```

Shrink performs these steps:
1. Reset the clock-sweep cursor, then publish `lowNBuffers := new_size`.
2. Barrier: wait for all backends to acknowledge the new `lowNBuffers`.
3. Purge any freelist entries above `lowNBuffers`.
4. Evict buffers in the `[lowNBuffers, highNBuffers)` range.
5. Acquire `AccessNBuffersLock` exclusively, set `highNBuffers = lowNBuffers`, release the lock.
6. Free physical memory in `[lowNBuffers, old_highNBuffers)`[^1].

After step 1, the memory region becomes:
```
 0             lowNBuffers             highNBuffers               MaxNBuffers
|--------------------|---------------------|--------------------------|
       ALLOCATED              TO EVICT                  RESERVED
```

- `[0, lowNBuffers)`: Backends allocate buffers in this range.
- `[lowNBuffers, highNBuffers)`: The coordinator will evict buffers in this range.
- `[highNBuffers, MaxNBuffers)`: The reserved range, which must not be used.

When the shrink completes, the memory region becomes:
```
                  lowNBuffers
 0                highNBuffers                                MaxNBuffers
|--------------------|------------------------------------------------|
       ALLOCATED                                        RESERVED
```
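
The shrink steps can be traced on a toy, single-threaded model. This is only a sketch: `ToyPool`, its fields, and `toy_shrink` are hypothetical, the barrier and freelist purge have no counterpart here, and "freeing memory" is modeled by clearing a flag rather than by calling `madvise()`.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_NBUFFERS 16

/* Hypothetical miniature buffer pool. */
typedef struct ToyPool
{
    int  lowNBuffers;
    int  highNBuffers;
    int  clockHand;                  /* next clock-sweep candidate */
    bool inUse[TOY_MAX_NBUFFERS];    /* slot currently holds a page */
    bool mapped[TOY_MAX_NBUFFERS];   /* slot has backing memory */
} ToyPool;

static void
toy_shrink(ToyPool *pool, int new_size)
{
    int old_high = pool->highNBuffers;

    assert(new_size > 0 && new_size <= pool->lowNBuffers);

    /* Step 1: reset the clock-sweep cursor, then publish the lower bound. */
    pool->clockHand = 0;
    pool->lowNBuffers = new_size;

    /* Steps 2 and 3 (barrier, freelist purge) are omitted in this toy. */

    /* Step 4: evict everything in [lowNBuffers, highNBuffers). */
    for (int i = pool->lowNBuffers; i < pool->highNBuffers; i++)
        pool->inUse[i] = false;

    /* Step 5: lower the upper bound (under the exclusive lock in the real flow). */
    pool->highNBuffers = pool->lowNBuffers;

    /* Step 6: release backing memory for [lowNBuffers, old_highNBuffers). */
    for (int i = pool->lowNBuffers; i < old_high; i++)
        pool->mapped[i] = false;
}
```

Note that the memory is released only after `highNBuffers` has been lowered, so no slot below `highNBuffers` is ever unmapped.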

[^1]: Huge pages significantly speed up freeing memory. It takes less than a second to free 32 GB of memory.

### Expand

The initial state of the memory region is:

```
                 lowNBuffers
 0               highNBuffers         new_size                  MaxNBuffers
|--------------------|---------------------|--------------------------|
       ALLOCATED              TO ALLOCATE              RESERVED
```

Expand performs these steps:
1. Allocate physical memory in `[lowNBuffers, new_size)` and initialize the new buffer descriptors.
2. Acquire `AccessNBuffersLock` exclusively.
3. Reset the clock-sweep cursor to point at the start of the new range, so the next clock sweep tries the freshly added empty buffers.
4. Publish `highNBuffers := new_size` then `lowNBuffers := new_size`.
5. Release the lock. Backends taking the lock in shared mode now see the fully grown pool. Concurrent readers that load the atomics without the lock may briefly see `lowNBuffers < highNBuffers` between the two writes above, which is harmless since both bounds already cover initialized memory.

When the expand completes, the memory region becomes:
```
                                       lowNBuffers
 0                                     highNBuffers          MaxNBuffers
|------------------------------------------|--------------------------|
       ALLOCATED                                         RESERVED
```
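
The ordering of the expand steps can be sketched on a toy model as well. `ToyGrowPool` and `toy_expand` are hypothetical names; the lock acquisition and release are only marked by comments, and memory allocation is modeled by setting an `initialized` flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_MAX_NBUFFERS 16

/* Hypothetical miniature pool with atomic water marks. */
typedef struct ToyGrowPool
{
    atomic_int lowNBuffers;
    atomic_int highNBuffers;
    int        clockHand;
    bool       initialized[TOY_MAX_NBUFFERS];
} ToyGrowPool;

/* Returns false and changes nothing if new_size is not a valid growth target. */
static bool
toy_expand(ToyGrowPool *pool, int new_size)
{
    int old_size = atomic_load(&pool->lowNBuffers);

    if (new_size <= old_size || new_size > TOY_MAX_NBUFFERS)
        return false;

    /* Step 1: allocate and initialize [old_size, new_size) before publishing. */
    for (int i = old_size; i < new_size; i++)
        pool->initialized[i] = true;

    /* Step 2: the real flow takes AccessNBuffersLock exclusively here. */

    /* Step 3: point the clock-sweep cursor at the start of the new range. */
    pool->clockHand = old_size;

    /* Step 4: publish the upper bound first, then the lower bound. */
    atomic_store(&pool->highNBuffers, new_size);
    atomic_store(&pool->lowNBuffers, new_size);

    /* Step 5: the real flow releases the lock here. */
    return true;
}
```

Publishing `highNBuffers` before `lowNBuffers` keeps `lowNBuffers <= highNBuffers` true at every instant, and both values only ever cover slots that step 1 has already initialized.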

### Background-writer interaction

`BgBufferSync` keeps a `static int saved_low_nbuffers` snapshot
and compares it against `GetLowNBuffers()` on every invocation. If the
value differs, a resize has happened: the smoothed allocation rate and
clean-buffer density it was tracking are no longer meaningful, so it
clears `saved_info_valid` and starts fresh.
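
The snapshot-and-compare pattern looks roughly like the sketch below. It is not the actual `BgBufferSync` code: `toy_get_low_nbuffers` stands in for `GetLowNBuffers()`, and the function only reports whether it restarted its statistics instead of doing any real work.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the published lower water mark. */
static int toy_current_low_nbuffers = 1000;

static int
toy_get_low_nbuffers(void)
{
    return toy_current_low_nbuffers;
}

/* Returns true when the call had to discard its history and start fresh. */
static bool
toy_bg_buffer_sync(void)
{
    static int  saved_low_nbuffers = -1;
    static bool saved_info_valid = false;
    int         low = toy_get_low_nbuffers();
    bool        restarted = false;

    /* A changed pool size means the tracked statistics are stale. */
    if (saved_info_valid && low != saved_low_nbuffers)
        saved_info_valid = false;

    if (!saved_info_valid)
    {
        /* Start fresh: remember the size the new statistics refer to. */
        saved_low_nbuffers = low;
        saved_info_valid = true;
        restarted = true;
    }

    return restarted;
}
```

Because the comparison runs on every invocation, no explicit notification from the resize coordinator is needed in this sketch.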

### Coordinating with backends that visit the buffer array

Special coordination is required with backends that scan buffers based on
the upper bound (`highNBuffers`), e.g., `DropRelationBuffers`.
Otherwise, a backend visiting a buffer in `[lowNBuffers, highNBuffers)` can hit a SEGV when a shrink operation frees the memory in that range.

Coordination is done via `BEGIN_NBUFFERS_ACCESS(localNBuffers)` and
`END_NBUFFERS_ACCESS(localNBuffers)`. A backend calls `BEGIN_NBUFFERS_ACCESS()`
before visiting the buffer array and `END_NBUFFERS_ACCESS()` afterwards.

`BEGIN_NBUFFERS_ACCESS` acquires `AccessNBuffersLock` in shared mode for
the duration of visiting the buffer array. The resize coordinator acquires
`AccessNBuffersLock` in exclusive mode around the operations that mutate the
buffer pool memory and publish a new `highNBuffers` (free during shrink /
allocate during expand). This both waits for any in-flight backend to
complete and blocks new ones from starting with a stale view of
`highNBuffers`.

### Error handling

A failure during resize will NOT bring down postgres.

`pg_resize_shared_buffers()` is interruptible: SIGINT/CTRL-C and
`pg_terminate_backend()` are honoured, and any `ereport(ERROR)` raised from
inside a resize step (e.g. an OOM, a `madvise()` failure) propagates back to
the caller after cleanup runs.

A shrink can be rolled back if it is interrupted between lowering `lowNBuffers`
and lowering `highNBuffers`. The memory in `[lowNBuffers, highNBuffers)` is
still mapped, so the rollback is safe. Buffers that were already evicted in the
partial run come back as empty and are picked up by the clock sweep on the
next allocation attempt; buffers that the partial purge moved off the
freelist come back via the normal `StrategyFreeBuffer()` lifecycle.

An error at the `madvise(MADV_REMOVE)` step cannot be rolled back.
By the time `madvise()` is called, both water marks are already at `new_size`,
so the buffer pool is self-consistent at the smaller size.
The caller may retry with an expand later.

`madvise(MADV_POPULATE_WRITE)` may fail during expand.
When this happens, `lowNBuffers` and `highNBuffers` remain at
`old_size` and backends keep using the old (smaller) pool. The
partially touched bytes in the `[old_size, new_size)` range sit unused in
shmem until a future successful expand re-initializes them. The
coordinator surfaces this to the caller as a hard `ERROR`.
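
The two shrink failure cases can be contrasted in a small sketch. It assumes that rolling back means restoring `lowNBuffers` to the unchanged `highNBuffers`; that detail, together with `ToyMarks`, `ToyFailPoint`, and `toy_shrink_with_cleanup`, is an illustration and not taken from the actual implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical pair of water marks. */
typedef struct ToyMarks
{
    int lowNBuffers;
    int highNBuffers;
} ToyMarks;

/* Where the simulated error is raised. */
typedef enum ToyFailPoint
{
    TOY_FAIL_NONE,
    TOY_FAIL_DURING_EVICT,   /* before highNBuffers is lowered */
    TOY_FAIL_AT_FREE         /* at the memory-freeing step */
} ToyFailPoint;

/* Returns true on success, false if the simulated error fired. */
static bool
toy_shrink_with_cleanup(ToyMarks *m, int new_size, ToyFailPoint fail)
{
    m->lowNBuffers = new_size;

    if (fail == TOY_FAIL_DURING_EVICT)
    {
        /* highNBuffers is untouched and memory is still mapped: undo. */
        m->lowNBuffers = m->highNBuffers;
        return false;
    }

    m->highNBuffers = new_size;

    if (fail == TOY_FAIL_AT_FREE)
    {
        /* No rollback: the pool stays consistent at the smaller size. */
        return false;
    }

    return true;
}
```

In both failure cases the water marks end up equal, so the pool is in a steady state when the error reaches the caller.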