Re: dshash_find_or_insert vs. OOM

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Sami Imseih <samimseih(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: dshash_find_or_insert vs. OOM
Date: 2026-03-26 23:32:52
Message-ID: acXCJODjsCytdpwT@paquier.xyz
Lists: pgsql-hackers

On Thu, Mar 26, 2026 at 04:26:33PM -0400, Andres Freund wrote:
> I think tests like this do have value and I'd definitely run them first while
> hacking on code related to dshash, rather than relying on the regression tests
> or such. E.g. having test_aio was invaluable to being able to get AIO into a
> stable state. When hacking on something with complicated edge cases I'd just
> add a test for it, making development faster as well as ensuring the
> complicated case continues to work into the future.

These test modules have a lot of value because they are cheap to run
and can usually reproduce edge cases that no other part of the code
tree would be able to reach in a predictable way. Cheap, fast and
reliable is good. On top of that, they can serve as code templates.
Bonus points.

> However, creating its own test module for small parts of the codebase doesn't
> quite make sense to me. A pretty decent chunk of the test is just boilerplate
> to add a new module, and every new test module requires its own cluster, which
> adds a fair bit of runtime overhead, particularly on windows. I think
> test_dsa, test_dsm_registry, test_dshash should just be one combined test
> module, they're testing quite closely related code.

Yeah, perhaps grouping all the DSA things into a single module would
be OK, with a parallel schedule that would speed things up. To me, it
depends on the complexity and the size of the module.

That said, I think that the shape of the proposed test_dshash is
wrong: it proposes one SQL function that does a bunch of
dshash-related operations in a single function call, in a random
manner. We have a shared memory state that can survive across SQL
calls, so a set of thinner SQL functions that directly wrap the dshash
calls manipulating the table would feel much more natural to me. It
would also be easier to design edge cases in the SQL scripts
themselves.
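
To illustrate the shape I mean, here is a rough, self-contained toy
model in plain C. It does not use the dshash API at all: the toy_*
names, the fixed-size array standing in for the shared table, and the
"table full" failure standing in for an out-of-memory condition are
all made up for this sketch. The point is only that each wrapper does
one thing against state that persists between calls, and that the edge
case lives in the caller (which would be the SQL script in a real
module).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy stand-in for a shared hash table.  NOT the dshash API. */
#define TOY_CAPACITY 4			/* tiny, so "out of memory" is easy to hit */

typedef struct ToyEntry
{
	bool		used;
	int			key;
	int			value;
} ToyEntry;

/* State that survives across calls, like shared memory across SQL calls. */
static ToyEntry toy_table[TOY_CAPACITY];

static ToyEntry *
toy_lookup(int key)
{
	for (int i = 0; i < TOY_CAPACITY; i++)
		if (toy_table[i].used && toy_table[i].key == key)
			return &toy_table[i];
	return NULL;
}

/* Thin wrapper: forget all entries. */
void
toy_reset(void)
{
	memset(toy_table, 0, sizeof(toy_table));
}

/* Thin wrapper: lookup only.  Returns true and sets *value if present. */
bool
toy_find(int key, int *value)
{
	ToyEntry   *e = toy_lookup(key);

	if (e == NULL)
		return false;
	*value = e->value;
	return true;
}

/*
 * Thin wrapper: find or insert.  Returns false if the key is absent and
 * no slot is free (the simulated OOM).  *found reports a pre-existing key,
 * whose value is left untouched.
 */
bool
toy_find_or_insert(int key, int value, bool *found)
{
	*found = (toy_lookup(key) != NULL);
	if (*found)
		return true;
	for (int i = 0; i < TOY_CAPACITY; i++)
	{
		if (!toy_table[i].used)
		{
			toy_table[i] = (ToyEntry) {true, key, value};
			return true;
		}
	}
	return false;
}

/* Thin wrapper: delete by key.  Returns true if an entry was removed. */
bool
toy_delete(int key)
{
	ToyEntry   *e = toy_lookup(key);

	if (e == NULL)
		return false;
	e->used = false;
	return true;
}

/* The "script": the edge case is composed by the caller, step by step. */
void
toy_script_oom_then_retry(void)
{
	bool		found;
	bool		ok;

	toy_reset();
	for (int k = 0; k < TOY_CAPACITY; k++)
	{
		ok = toy_find_or_insert(k, k * 10, &found);
		assert(ok && !found);
	}
	ok = toy_find_or_insert(99, 990, &found);	/* full: must fail */
	assert(!ok && !found);
	ok = toy_find_or_insert(2, 0, &found);		/* existing key still works */
	assert(ok && found);
	ok = toy_delete(2);							/* free one slot */
	assert(ok);
	ok = toy_find_or_insert(99, 990, &found);	/* retry now succeeds */
	assert(ok && !found);
}
```

Calling toy_script_oom_then_retry() from a main() walks through
"fill, fail, delete, retry" deterministically, which a single function
doing random operations could not guarantee. In a real module each
wrapper would presumably be a SQL-callable function and the script
would be a regression SQL file.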
--
Michael
