From: | Tomas Vondra <tomas(at)vondra(dot)me> |
---|---|
To: | Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Aidar Imamov <a(dot)imamov(at)postgrespro(dot)ru> |
Cc: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Hash table scans outside transactions |
Date: | 2025-05-25 15:05:59 |
Message-ID: | c733d646-12d6-480f-874f-22a97db77c09@vondra.me |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 5/25/25 13:36, Dilip Kumar wrote:
> On Wed, May 21, 2025 at 3:02 AM Aidar Imamov <a(dot)imamov(at)postgrespro(dot)ru> wrote:
>>
>> Hi!
>> I tried to resend this thread directly to myself, but for some reason it
>> ended up in the whole hackers list..
>>
>> I thought I'd chime in on this topic since it hasn't really been
>> discussed
>> anywhere else (or maybe I missed it).
>> I've attached two patches: the first one is a little test extension to
>> demonstrate the problem (just add "hash_test" to
>> "shared_preload_libraries"),
>> and the second is a possible solution. Basically, the idea is that we
>> don't
>> reset the scan counter if we find scans that started outside of the
>> current
>> transaction at the end. I have to admit, though, that I can't
>> immediately
>> say what other implications this might have or what else we need to
>> watch
>> out for if we try this.
>> Maybe any thoughts on that?
>
> I haven't reviewed the complete patch or tested it, but I don't see
> any issues with it.
>
I may be wrong, but I'd guess the restriction is there for a reason.
Maybe it's not needed anymore, or maybe it's needed only in some cases,
or something like that.
So the most important thing for a patch removing the restriction it is
to explain why it's there and why it's safe to relax it. The extension
demonstrating that the restriction exists doesn't really help with that,
it shows why we have it. Not hat it's safe to remove it.
I'm not a dynahash expert, but isn't the first sentence from dynahash.c
relevant:
* dynahash.c supports both local-to-a-backend hash tables and hash
* tables in shared memory. For shared hash tables, it is the caller's
* responsibility to provide appropriate access interlocking. The
* simplest convention is that a single LWLock protects the whole hash
* table. Searches (HASH_FIND or hash_seq_search) need only shared
* lock, but any update requires exclusive lock.
In other words, let's say you have a hash table, protected by LWLock.
That is, you're holding the lock in shared mode during seqscan. And then
we drop the locks for some reason (say, transaction end). The table can
be modified - won't that break the seqscan?
FWIW I think with the use case from the beginning of this thread:
1. Add/update/remove entries in hash table
2. Scan the existing entries and perform one transaction per entry
3. Close scan
Why not to simply build a linked list after step (1)?
regards
--
Tomas Vondra
From | Date | Subject | |
---|---|---|---|
Next Message | Daniil Davydov | 2025-05-25 17:22:42 | Re: POC: Parallel processing of indexes in autovacuum |
Previous Message | Dean Rasheed | 2025-05-25 12:41:40 | Re: MERGE issues around inheritance |