Re: dshash_find_or_insert vs. OOM

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: dshash_find_or_insert vs. OOM
Date: 2026-03-18 01:33:55
Message-ID: 11E4DBC2-7DE2-4CD9-8D64-EA30B2937193@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Mar 18, 2026, at 07:34, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> Hi,
>
> For most memory allocation primitives, we offer a "no OOM" version.
> dshash_find_or_insert is an exception. Here's a small patch to fix
> that. I have some interest in slipping this into v19 even though I am
> submitting it quite late, because it would be useful for
> pg_stash_advice[1]. Let me know what you think about that.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com
>
> [1] As yet uncommitted. See pg_plan_advice thread.
> <v1-0001-dshash-Make-it-possible-to-suppress-out-of-memory.patch>

I think adding a DSHASH_INSERT_NO_OOM flag is the right choice. As you mentioned, we already have other _NO_OOM flags, so this feels consistent with existing patterns.

I don’t see any correctness issue with the patch. I just have a couple of minor nits.

1
```
@@ -477,14 +485,22 @@ restart:
* reacquire all the locks in the right order to avoid deadlocks.
*/
LWLockRelease(PARTITION_LOCK(hash_table, partition_index));
- resize(hash_table, hash_table->size_log2 + 1);
+ if (!resize(hash_table, hash_table->size_log2 + 1, flags))
+ return NULL;

goto restart;
}

/* Finally we can try to insert the new item. */
item = insert_into_bucket(hash_table, key,
- &BUCKET_FOR_HASH(hash_table, hash));
+ &BUCKET_FOR_HASH(hash_table, hash),
+ flags);
+ if (item == NULL)
+ {
+ Assert((flags & DSHASH_INSERT_NO_OOM) != 0);
+ LWLockRelease(PARTITION_LOCK(hash_table, partition_index));
+ return NULL;
+ }
```

When OOM happens, Assert((flags & DSHASH_INSERT_NO_OOM) != 0); makes sense. But for resize(), the assert is inside resize(), while for insert_into_bucket(), the assert is in the caller. That feels a bit inconsistent to me, and I think it hurts readability a little. A reader might wonder why there is no corresponding assert after resize() unless they go read the function body.

I think either style is fine, but using the same style in both places would be better.

2
```
/* Allocate the space for the new table. */
- new_buckets_shared =
- dsa_allocate_extended(hash_table->area,
- sizeof(dsa_pointer) * new_size,
- DSA_ALLOC_HUGE | DSA_ALLOC_ZERO);
- new_buckets = dsa_get_address(hash_table->area, new_buckets_shared);
+ {
+ int dsa_flags = DSA_ALLOC_HUGE | DSA_ALLOC_ZERO;
+
+ if (flags & DSHASH_INSERT_NO_OOM)
+ dsa_flags |= DSA_ALLOC_NO_OOM;
+ new_buckets_shared =
+ dsa_allocate_extended(hash_table->area,
+ sizeof(dsa_pointer) * new_size,
+ dsa_flags);
+ }
```

Making this a nested block does have the benefit of keeping dsa_flags close to where it is used. But from my impression, this style is still fairly uncommon in the codebase. I worry it may implicitly signal to other hackers that this is an acceptable pattern. So unless we intentionally want to encourage that style, I would lean toward avoiding it here.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2026-03-18 01:43:06 DOCS: typo on CLUSTER page
Previous Message Robert Haas 2026-03-17 23:34:17 dshash_find_or_insert vs. OOM