Re: Shared hash table allocations

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(at)vondra(dot)me>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>
Subject: Re: Shared hash table allocations
Date: 2026-04-02 11:52:07
Message-ID: a47e1b92-2e88-4554-b4d3-61934173222d@iki.fi
Lists: pgsql-hackers

On 02/04/2026 13:24, Matthias van de Meent wrote:
> On Tue, 31 Mar 2026 at 23:25, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>>
>> 0003: In patch 0003 I removed that flexibility by marking them both with
>> HASH_FIXED_SIZE, and making init_size equal to max_size. That also stops
>> the hash tables from using any of the other remaining wiggle room,
>> making them truly fixed-size.
>
> I think this patch finally gave me a good reason why PROCLOCK would've
> needed to be allocated with double the sizes of LOCK:
>
> LOCK is (was) initialized with only 50% of its max capacity. If
> PROCLOCK was initialized with the same parameters and all spare shmem
> is then allocated to other processes, then backends wouldn't be able
> to safely use max_locks_per_transaction. To guarantee no OOMs when all
> backends use max_locks_per_transaction, PROCLOCK's size must be
> doubled to make sure PROCLOCK has sufficient space. (The same isn't
> usually an issue for LOCK, because it's very likely backends will
> operate on the same tables, and thus will be able to share most of the
> LOCK structs.)

Hmm, I don't know if that makes sense. It can happen that you have a lot
of backends acquiring the same, smaller set of locks, growing PROCLOCK
so that it uses up all the available wiggle room, while LOCK can never
grow from its initial size, 1/2 * max_locks_per_transaction *
MaxBackends. If the workload then changes so that every backend tries to
acquire exactly max_locks_per_transaction locks, but this time each
lock is on a different object, you will run out of shared memory at half
the size you expected.

The opposite can't happen, because PROCLOCK is always at least as large
as LOCK. It doesn't matter what you set PROCLOCK's initial size to; it
will grow together with LOCK, and you will not run out of shared memory
before PROCLOCK has grown to max_locks_per_transaction * MaxBackends
anyway.

> Now that LOCK is fully allocated, I think the size doubling can be
> removed, or possibly parameterized for those that need it.

I don't think that follows. The 2x factor is pretty arbitrary, but it's
still a fair assumption that many backends will be acquiring locks on
the same objects so you need more space in PROCLOCK than in LOCK.

I don't know how true that assumption is. It feels right for OLTP
applications. But the situation where I've hit max_locks_per_transaction
is when I've tried to create one table with thousands of partitions. Or
rather, when I try to *drop* that table. In that situation, there's just
one transaction acquiring all the locks, so the PROCLOCK / LOCK ratio is 1.

We could parameterize it, but I feel that's probably overkill and would
expose too much detail to users. At the end of the day, if you hit the
limit, you just bump up max_locks_per_transaction. If there are two
settings, it's more complicated: which one do you change? You probably
don't mind wasting the few MB of memory that you could gain by carefully
tuning the LOCK / PROCLOCK factor.
- Heikki
