From: | Matheus Alcantara <matheusssilv97(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Batch TIDs lookup in ambulkdelete |
Date: | 2025-05-13 21:25:40 |
Message-ID: | CAFY6G8dPxGOtHQgvvhu4zxnggk6Y0j2XHmtPhFAavCSbcjjQCw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 30/04/25 19:36, Masahiko Sawada wrote:
> Here are the summary of each attached patch:
>
> 0001: Introduce TIdStoreIsMemberMulti() where we can do IsMember check
> for multiple TIDs in one function call. If the given TIDs are sorted
> (at least in block number), we can save radix tree lookup for the same
> page entry.
>
> 0002: Convert IndexBUlkDeleteCallback() to batched operation.
>
> 0003: Use batch TIDs lookup in btree index bulk-deletion.
>
> In patch 0003, we implement batch TID lookups for both each
> deduplicated index tuple and remaining all regular index tuples, which
> seems to be the most straightforward approach. While further
> optimizations are possible, such as performing batch TID lookups for
> all index tuples on a single page, these could introduce additional
> overhead from sorting and re-sorting TIDs. Moreover, considering that
> radix tree lookups are relatively inexpensive, the benefits of sorting
> TIDs and using TidStoreIsMemberMulti() might be minimal. Nevertheless,
> these potential optimizations warrant further evaluation to determine
> their actual impact on performance.
>
> Also, the patch includes the batch TIDs lookup support only for btree
> indexes but we potentially can improve other index AMs too.
>
The code looks good and also +1 for the idea. I just have some small
points:
- Maybe it would be good to mention somewhere that
IndexBulkDeleteCallback() callback returns the number of tids
members found on TidStore?
- The vac_tid_reaped() docs may need to be updated?
I also executed meson tests for each patch individually and the 0002
patch is not passing on my machine (MacOs).
Ok: 39
Expected Fail: 0
Fail: 271
Unexpected Pass: 0
Skipped: 22
Timeout: 0
One behaviour that I found by executing the 0002 tests is that it may be
leaking some shared memory segments. I notice that because after
executing the tests I tried to re-execute based on master and all tests
were failing with the "Failed system call was shmget(key=97530599,
size=56, 03600)" error. I also checked the shared memory segments using
"ipcs -m" and it returns some segments which is don't returned when I
execute the tests on master (after cleaning the leaked memory segments)
and it also doesn't occur when executing based on 0001 or 0003.
~/d/p/batch-tids-lookup-ambulkdelete ❯❯❯ ipcs -m
IPC status from <running system> as of Tue May 13 18:19:14 -03 2025
T ID KEY MODE OWNER GROUP
Shared Memory:
m 18087936 0x05f873bf --rw------- matheus staff
m 15925250 0x05f966fe --rw------- matheus staff
m 24248325 0x05f9677e --rw------- matheus staff
....
Note that the the 0003 patch don't have this issue so at the end we will
not have problem with this I think, but it maybe be good to mention that
although the patches are separate, there is a dependency between them,
which may cause issues on buildfarm?
--
Matheus Alcantara
From | Date | Subject | |
---|---|---|---|
Next Message | Scott Mead | 2025-05-13 21:53:13 | Re: Disable parallel query by default |
Previous Message | Greg Sabino Mullane | 2025-05-13 21:07:58 | Re: Disable parallel query by default |