| From: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
|---|---|
| To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
| Subject: | Re: dshash_find_or_insert vs. OOM |
| Date: | 2026-03-18 01:33:55 |
| Message-ID: | 11E4DBC2-7DE2-4CD9-8D64-EA30B2937193@gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
> On Mar 18, 2026, at 07:34, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> Hi,
>
> For most memory allocation primitives, we offer a "no OOM" version.
> dshash_find_or_insert is an exception. Here's a small patch to fix
> that. I have some interest in slipping this into v19 even though I am
> submitting it quite late, because it would be useful for
> pg_stash_advice[1]. Let me know what you think about that.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com
>
> [1] As yet uncommitted. See pg_plan_advice thread.
> <v1-0001-dshash-Make-it-possible-to-suppress-out-of-memory.patch>
I think adding a DSHASH_INSERT_NO_OOM flag is the right choice. As you mentioned, we already have other _NO_OOM flags, so this feels consistent with existing patterns.
I don’t see any correctness issue with the patch. I just have a couple of minor nits.
1
```
@@ -477,14 +485,22 @@ restart:
* reacquire all the locks in the right order to avoid deadlocks.
*/
LWLockRelease(PARTITION_LOCK(hash_table, partition_index));
- resize(hash_table, hash_table->size_log2 + 1);
+ if (!resize(hash_table, hash_table->size_log2 + 1, flags))
+ return NULL;
goto restart;
}
/* Finally we can try to insert the new item. */
item = insert_into_bucket(hash_table, key,
- &BUCKET_FOR_HASH(hash_table, hash));
+ &BUCKET_FOR_HASH(hash_table, hash),
+ flags);
+ if (item == NULL)
+ {
+ Assert((flags & DSHASH_INSERT_NO_OOM) != 0);
+ LWLockRelease(PARTITION_LOCK(hash_table, partition_index));
+ return NULL;
+ }
```
When OOM happens, Assert((flags & DSHASH_INSERT_NO_OOM) != 0); makes sense. But for resize(), the assert is inside resize(), while for insert_into_bucket(), the assert is in the caller. That feels a bit inconsistent to me, and I think it hurts readability a little. A reader might wonder why there is no corresponding assert after resize() unless they go read the function body.
I think either style is fine, but using the same style in both places would be better.
2
```
/* Allocate the space for the new table. */
- new_buckets_shared =
- dsa_allocate_extended(hash_table->area,
- sizeof(dsa_pointer) * new_size,
- DSA_ALLOC_HUGE | DSA_ALLOC_ZERO);
- new_buckets = dsa_get_address(hash_table->area, new_buckets_shared);
+ {
+ int dsa_flags = DSA_ALLOC_HUGE | DSA_ALLOC_ZERO;
+
+ if (flags & DSHASH_INSERT_NO_OOM)
+ dsa_flags |= DSA_ALLOC_NO_OOM;
+ new_buckets_shared =
+ dsa_allocate_extended(hash_table->area,
+ sizeof(dsa_pointer) * new_size,
+ dsa_flags);
+ }
```
Making this a nested block does have the benefit of keeping dsa_flags close to where it is used. But from my impression, this style is still fairly uncommon in the codebase. I worry it may implicitly signal to other hackers that this is an acceptable pattern. So unless we intentionally want to encourage that style, I would lean toward avoiding it here.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Peter Smith | 2026-03-18 01:43:06 | DOCS: typo on CLUSTER page |
| Previous Message | Robert Haas | 2026-03-17 23:34:17 | dshash_find_or_insert vs. OOM |