From: | Andy Fan <zhihuifan1213(at)163(dot)com> |
---|---|
To: | Peter Geoghegan <pg(at)bowt(dot)ie> |
Cc: | Mircea Cadariu <cadariu(dot)mircea(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Returning nbtree posting list TIDs in DESC order during backwards scans |
Date: | 2025-07-27 13:42:53 |
Message-ID: | 87h5yx92ci.fsf@163.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
> On Thu, Jul 17, 2025 at 2:26 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>> The loop has an early check for this (for non-itemDead entries) here:
>>
>> /* Quickly skip over items never ItemDead-set by btgettuple */
>> if (!kitem->itemDead)
>> continue;
>>
>> I really doubt that this matters
> I'll need to think about issues around adding the new kitem->itemDead
> bitfield. I'm not really worried about _bt_killitems; more so about
> the routines called by _bt_readpage, which must make sure that the bit
> is unset every time a TID is saved in so->currPos.items[].
Yes, otherwise the live tuple maybe marked as dead, which is terrible. I
think you have unset the bit on all the needed places, including
_bt_saveitem, _bt_savepostingitem and _bt_setuppostingitems for the
first item in postlist item. I can't find out anything is missed.
(I thought another place for this work might be _bt_returnitem, this
might be a more centralized place to set. Later I think it is totally
wrong because for the queries like SELECT * FROM t WHERE idx_col = 3
LIMIT 3; Some items in so->currPos.items[] were filled in
_bt_readpage but maybe never returned to heap at all, and later
_bt_killitems also run against it on the whole, so unsetting the bit on
_bt_returnitem is too late...)
I think this patch does two things. one is refactoring the data struct
for _bt_killitems, the other one is scaning the postlist in the backward
way for the backward scan. then does the below changes belongs to any of
them? Is it an intentional change?
_bt_killitems:
if (offnum < minoff)
- continue; /* pure paranoia */
+ {
+ /*
+ * Page must have originally been the rightmost page, but has
+ * since split
+ */
+ Assert(!so->dropPin);
+ offnum = minoff;
+ }
At last, I can get the same test result as you. The buffer hits go back to
23 in the test case, thank for working on this!
> I think that this patch isn't too far off being committable.
I think so.
--
Best Regards
Andy Fan
From | Date | Subject | |
---|---|---|---|
Next Message | Greg Burd | 2025-07-27 14:12:11 | Re: [PATCH] Let's get rid of the freelist and the buffer_strategy_lock |
Previous Message | Andrey Borodin | 2025-07-27 11:53:45 | Re: IPC/MultixactCreation on the Standby server |