Re: Hash table scans outside transactions

From: Aidar Imamov <a(dot)imamov(at)postgrespro(dot)ru>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Hash table scans outside transactions
Date: 2025-05-20 21:32:16
Message-ID: 638cf4b115c4caebb1274034706439c6@postgrespro.ru
Lists: pgsql-hackers

Ashutosh Bapat wrote 2023-03-28 15:58:
> Bumping it to attract some attention.
>
> On Tue, Mar 21, 2023 at 12:51 PM Ashutosh Bapat
> <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>>
>> Hi,
>> Hash table scans (seq_scan_table/level) are cleaned up at the end of a
>> transaction in AtEOXact_HashTables(). If a hash seq scan continues
>> beyond transaction end it will meet "ERROR: no hash_seq_search scan
>> for hash table" in deregister_seq_scan(). That seems to limit hash
>> table usage.
>>
>> Our use case is
>> 1. Add/update/remove entries in hash table
>> 2. Scan the existing entries and perform one transaction per entry
>> 3. Close scan
>>
>> repeat above steps in an infinite loop. Note that we do not
>> add/modify/delete entries in step 2. We can't use linked lists since
>> the entries need to be updated or deleted using hash keys. Because the
>> hash seq scan is cleaned up at the end of the transaction, we
>> encounter error in the 3rd step. I don't see that the actual hash
>> table scan depends upon the seq_scan_table/level[] which is cleaned up
>> at the end of the transaction.
>>
>> I have the following questions:
>> 1. Is there a way to avoid cleaning up seq_scan_table/level[] when the
>> transaction ends?
>> 2. Is there a way that we can use hash table implementation in
>> PostgreSQL code for our purpose?
>>
>>
>> --
>> Best Wishes,
>> Ashutosh Bapat

Hi!
I tried to resend this thread directly to myself, but for some reason it
ended up on the whole hackers list.

I thought I'd chime in on this topic since it hasn't really been
discussed anywhere else (or maybe I missed it).
I've attached two patches: the first is a small test extension that
demonstrates the problem (just add "hash_test" to
"shared_preload_libraries"), and the second is a possible solution.
Basically, the idea is that at the end of a transaction we don't reset
the scan counter if we find scans that were started outside of the
current transaction. I have to admit, though, that I can't immediately
say what other implications this might have or what else we need to
watch out for if we try this.
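
To make the failure mode concrete, here is a rough standalone model of
the scan bookkeeping as I understand it. It is neither the dynahash.c
code nor the attached patch: ToyTable, the toy_* functions, the
keep_outside_scans flag, and the convention that nest level 0 means
"outside any transaction" are all assumptions made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SEQ_SCANS 100

/* Hypothetical stand-in for a hash table handle. */
typedef struct ToyTable { const char *tabname; } ToyTable;

/* Registry of open sequential scans, modelled on the description above. */
static ToyTable *seq_scan_tables[MAX_SEQ_SCANS];
static int       seq_scan_level[MAX_SEQ_SCANS];
static int       num_seq_scans = 0;

/* Assumed convention: 0 = no transaction, 1 = top-level transaction. */
static int xact_nest_level = 0;

/* Remember an open scan and the nest level it was started at. */
static bool toy_register_seq_scan(ToyTable *t)
{
    if (num_seq_scans >= MAX_SEQ_SCANS)
        return false;
    seq_scan_tables[num_seq_scans] = t;
    seq_scan_level[num_seq_scans] = xact_nest_level;
    num_seq_scans++;
    return true;
}

/* Forget a scan; false models the "no hash_seq_search scan" error. */
static bool toy_deregister_seq_scan(ToyTable *t)
{
    for (int i = num_seq_scans - 1; i >= 0; i--)
    {
        if (seq_scan_tables[i] == t)
        {
            seq_scan_tables[i] = seq_scan_tables[num_seq_scans - 1];
            seq_scan_level[i] = seq_scan_level[num_seq_scans - 1];
            num_seq_scans--;
            return true;
        }
    }
    return false;
}

/*
 * End-of-transaction cleanup. With keep_outside_scans = false the
 * counter is always reset. With true (my reading of the idea, not the
 * patch itself) the reset is skipped if any scan was started outside
 * a transaction.
 */
static void toy_at_eoxact(bool keep_outside_scans)
{
    if (keep_outside_scans)
    {
        for (int i = 0; i < num_seq_scans; i++)
            if (seq_scan_level[i] == 0)
                return;
    }
    num_seq_scans = 0;
}

/* Open a scan, run one transaction per entry, then close the scan. */
static bool toy_run(bool keep_outside_scans)
{
    ToyTable t = {"toy"};
    bool     registered;

    num_seq_scans = 0;
    xact_nest_level = 0;
    registered = toy_register_seq_scan(&t);  /* outside any xact */
    assert(registered);

    for (int entry = 0; entry < 3; entry++)
    {
        xact_nest_level = 1;                 /* begin transaction */
        /* ... per-entry work would go here ... */
        toy_at_eoxact(keep_outside_scans);   /* commit */
        xact_nest_level = 0;
    }
    return toy_deregister_seq_scan(&t);      /* close scan */
}
```

In this model toy_run(false) returns false, because the first commit
wipes the registry and the final close finds nothing to remove, while
toy_run(true) returns true. The model says nothing about scans leaked
inside a transaction, which is one of the things that would need
checking.
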
Any thoughts on that?

regards,
Aidar Imamov

Attachment Content-Type Size
exampl_ext.patch text/x-diff 3.4 KB
mb_solution.patch text/x-diff 1.5 KB
