| From: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
|---|---|
| To: | Peter Geoghegan <pg(at)bowt(dot)ie> |
| Cc: | Andy Fan <zhihuifan1213(at)163(dot)com>, Mircea Cadariu <cadariu(dot)mircea(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Returning nbtree posting list TIDs in DESC order during backwards scans |
| Date: | 2025-12-03 12:31:43 |
| Message-ID: | 9037658C-159A-44B7-973A-093F6F18706A@gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi Peter,
The patch v4 looks carefully written and technically solid, and the core logic (switching killedItems[] to Bitmapset and reworking backwards posting list scans) is coherent.
I just got a few comments/questions:
> On Dec 3, 2025, at 11:08, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> Attached is v4, which does it that way.
>
> My plan is to commit this improved version in the next couple of days.
>
> --
> Peter Geoghegan
> <v4-0001-Return-TIDs-in-desc-order-during-backwards-scans.patch>
1
```
- /* Always invalidate so->killedItems[] before leaving so->currPos */
- so->numKilled = 0;
```
The old code only sets so->numKilled to 0 and reuse memory of so->killedItems[], now the new code always bms_free(so->killedItems) and re-alloc memory when next time adding a member to bms.
I am think that, if there were bms_clear(), then we could have just cleared the bitmap and reused the memory next time. But unfortunately, there is no such a bms_clear() now. What do you think to add bms_clear() and use it here? I don’t want to do that, I can try that once you push this patch.
2
```
+ /* Set up posting list state (and remember last TID) */
itemIndex--;
tupleOffset =
_bt_setuppostingitems(so, itemIndex, offnum,
- BTreeTupleGetPostingN(itup, 0),
+ BTreeTupleGetPostingN(itup, nitems - 1),
itup);
- /* Remember additional TIDs */
- for (int i = 1; i < BTreeTupleGetNPosting(itup); i++)
+
+ /* Remember all prior TIDs (must be at least one) */
+ for (int i = nitems - 2; i >= 0; i--)
{
itemIndex--;
_bt_savepostingitem(so, itemIndex, offnum,
```
I wonder if the comment “must be at lease one” should apply to the assignment of tupleOffset? The “for” loop starts from nitems-2, will it still must be at lease one item?
3
```
/*
- * Don't bother advancing the outermost loop's int iterator to
- * avoid processing killed items that relate to the same
- * offnum/posting list tuple. This micro-optimization hardly
- * seems worth it. (Further iterations of the outermost loop
- * will fail to match on this same posting list's first heap
- * TID instead, so we'll advance to the next offnum/index
- * tuple pretty quickly.)
+ * Don't advance itemIndex for outermost loop, no matter how
+ * nextIndex was advanced. It's possible that items whose
+ * TIDs weren't matched in posting list can still be killed
+ * (there might be a later tuple whose TID is a match).
*/
if (j == nposting)
killtuple = true;
```
I really don’t get what "Don't bother advancing the outermost loop's int iterator” means? Here we only set killtuple to true, then if (killtuple && !ItemIdIsDead(iid)), it breaks the inner while loop, in that case, outer while loop "while ((itemIndex = bms_next_member(so->killedItems, itemIndex)) >= 0)” will advance itemIndex.
I know this is a legacy comment, if you can explain a little bit, that will be very appreciated.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Ignat Remizov | 2025-12-03 12:37:26 | Re: [PATCH] Add enable_copy_program GUC to control COPY PROGRAM |
| Previous Message | Hayato Kuroda (Fujitsu) | 2025-12-03 12:27:06 | RE: Using MyDatabaseId in SET_LOCKTAG_APPLY_TRANSACTION |