Quick Links

Re: Hash table scans outside transactions

From:	Tomas Vondra <tomas(at)vondra(dot)me>
To:	Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Aidar Imamov <a(dot)imamov(at)postgrespro(dot)ru>
Cc:	Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Hash table scans outside transactions
Date:	2025-05-25 15:05:59
Message-ID:	c733d646-12d6-480f-874f-22a97db77c09@vondra.me
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 5/25/25 13:36, Dilip Kumar wrote:
> On Wed, May 21, 2025 at 3:02 AM Aidar Imamov <a(dot)imamov(at)postgrespro(dot)ru> wrote:
>>
>> Hi!
>> I tried to resend this thread directly to myself, but for some reason it
>> ended up in the whole hackers list..
>>
>> I thought I'd chime in on this topic since it hasn't really been
>> discussed
>> anywhere else (or maybe I missed it).
>> I've attached two patches: the first one is a little test extension to
>> demonstrate the problem (just add "hash_test" to
>> "shared_preload_libraries"),
>> and the second is a possible solution. Basically, the idea is that we
>> don't
>> reset the scan counter if we find scans that started outside of the
>> current
>> transaction at the end. I have to admit, though, that I can't
>> immediately
>> say what other implications this might have or what else we need to
>> watch
>> out for if we try this.
>> Maybe any thoughts on that?
>
> I haven't reviewed the complete patch or tested it, but I don't see
> any issues with it.
>

I may be wrong, but I'd guess the restriction is there for a reason.
Maybe it's not needed anymore, or maybe it's needed only in some cases,
or something like that.

So the most important thing for a patch removing the restriction it is
to explain why it's there and why it's safe to relax it. The extension
demonstrating that the restriction exists doesn't really help with that,
it shows why we have it. Not hat it's safe to remove it.

I'm not a dynahash expert, but isn't the first sentence from dynahash.c
relevant:

* dynahash.c supports both local-to-a-backend hash tables and hash
* tables in shared memory. For shared hash tables, it is the caller's
* responsibility to provide appropriate access interlocking. The
* simplest convention is that a single LWLock protects the whole hash
* table. Searches (HASH_FIND or hash_seq_search) need only shared
* lock, but any update requires exclusive lock.

In other words, let's say you have a hash table, protected by LWLock.
That is, you're holding the lock in shared mode during seqscan. And then
we drop the locks for some reason (say, transaction end). The table can
be modified - won't that break the seqscan?

FWIW I think with the use case from the beginning of this thread:

1. Add/update/remove entries in hash table
2. Scan the existing entries and perform one transaction per entry
3. Close scan

Why not to simply build a linked list after step (1)?

regards

--
Tomas Vondra

In response to

Re: Hash table scans outside transactions at 2025-05-25 11:36:49 from Dilip Kumar

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Daniil Davydov	2025-05-25 17:22:42	Re: POC: Parallel processing of indexes in autovacuum
Previous Message	Dean Rasheed	2025-05-25 12:41:40	Re: MERGE issues around inheritance