Make TID Scans recalculate the TIDs less often

From:	David Rowley <dgrowleyml(at)gmail(dot)com>
To:	PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Make TID Scans recalculate the TIDs less often
Date:	2025-09-17 04:59:51
Message-ID:	CAApHDvoLMuLXakcAAsfjW=aKd_iFpqzf7k4G4wnQnym9RdSMsA@mail.gmail.com
Lists:	pgsql-hackers

Over on [1], there was some concern about having to recalculate the
TID Scan's TidList on every recheck call. This work entails
evaluating all the TID expressions, sorting the resulting list of TIDs
and deduplicating it (so that we don't scan the same TID twice).
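To make the sort-and-deduplicate step concrete, here is a minimal standalone C sketch. It is not the executor code: DemoTid, demo_tid_cmp and demo_sort_and_dedup are hypothetical names invented for illustration, and the expression-evaluation part is left out entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tuple identifier: (block number, offset). */
typedef struct DemoTid
{
    uint32_t block;
    uint16_t offset;
} DemoTid;

/* Order by block first, then by offset within the block. */
static int
demo_tid_cmp(const void *a, const void *b)
{
    const DemoTid *x = a;
    const DemoTid *y = b;

    if (x->block != y->block)
        return (x->block < y->block) ? -1 : 1;
    if (x->offset != y->offset)
        return (x->offset < y->offset) ? -1 : 1;
    return 0;
}

/* Sort the array in place, drop adjacent duplicates, return the new length. */
static size_t
demo_sort_and_dedup(DemoTid *tids, size_t n)
{
    size_t kept;

    assert(tids != NULL || n == 0);
    if (n < 2)
        return n;

    qsort(tids, n, sizeof(DemoTid), demo_tid_cmp);

    /* After sorting, equal TIDs are adjacent, so one pass removes them. */
    kept = 1;
    for (size_t i = 1; i < n; i++)
    {
        if (demo_tid_cmp(&tids[i], &tids[kept - 1]) != 0)
            tids[kept++] = tids[i];
    }
    return kept;
}
```

The cost is dominated by the sort, which is why repeating it for a 100-item list on every rescan can add up.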

As it turns out, that work can take up quite a bit of time, and
redoing it on every rescan *could* be noticed *if* the list is big
enough and the rescans are frequent enough. The following case
demonstrates this:

set max_parallel_workers_per_gather=0;
set enable_seqscan=0;
set enable_material=0;
set jit=0;
select sum(c) from million m left join lateral (select count(*) c from
empty where ctid in
('(1,1)','(1,2)','(1,3)','(1,4)','(1,5)','(1,6)','(1,7)','(1,8)','(1,9)','(1,10)','(1,11)','(1,12)','(1,13)','(1,14)','(1,15)','(1,16)','(1,17)','(1,18)','(1,19)','(1,20)','(1,21)','(1,22)','(1,23)','(1,24)','(1,25)','(1,26)','(1,27)','(1,28)','(1,29)','(1,30)','(1,31)','(1,32)','(1,33)','(1,34)','(1,35)','(1,36)','(1,37)','(1,38)','(1,39)','(1,40)','(1,41)','(1,42)','(1,43)','(1,44)','(1,45)','(1,46)','(1,47)','(1,48)','(1,49)','(1,50)','(1,51)','(1,52)','(1,53)','(1,54)','(1,55)','(1,56)','(1,57)','(1,58)','(1,59)','(1,60)','(1,61)','(1,62)','(1,63)','(1,64)','(1,65)','(1,66)','(1,67)','(1,68)','(1,69)','(1,70)','(1,71)','(1,72)','(1,73)','(1,74)','(1,75)','(1,76)','(1,77)','(1,78)','(1,79)','(1,80)','(1,81)','(1,82)','(1,83)','(1,84)','(1,85)','(1,86)','(1,87)','(1,88)','(1,89)','(1,90)','(1,91)','(1,92)','(1,93)','(1,94)','(1,95)','(1,96)','(1,97)','(1,98)','(1,99)','(1,100)'))
on 1=1;

master:

Time: 613.541 ms
Time: 621.037 ms
Time: 623.430 ms

patched:

Time: 298.863 ms
Time: 298.015 ms
Time: 297.172 ms

The part I don't know is whether it's at all likely that anyone would
ever hit this. We have TID Scan, so filtering on ctid must be common
enough to warrant having that code (yes, I know it's required for
WHERE CURRENT OF too); I just don't know how common rescans are in
those queries.

The patch optimises this by redoing the recalculation only when a
parameter referenced somewhere in the TID quals has changed. If no
such parameter has changed, we reuse the list from the previous scan.

Does anyone think this is worth pursuing further?

Patch attached.

David

[1] https://postgr.es/m/4a6268ff-3340-453a-9bf5-c98d51a6f729@app.fastmail.com

Attachment	Content-Type	Size
v1-0001-Reduce-rescan-overheads-in-TID-Range-Scan.patch	application/octet-stream	10.3 KB

Responses

Re: Make TID Scans recalculate the TIDs less often at 2025-09-17 06:28:57 from Andrey Borodin