|From:||Heikki Linnakangas <hlinnaka(at)iki(dot)fi>|
|To:||Peter Geoghegan <pg(at)bowt(dot)ie>|
|Cc:||Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>|
|Subject:||Re: Making all nbtree entries unique by having heap TIDs participate in comparisons|
|Views:||Raw Message | Whole Thread | Download mbox|
On 13/03/2019 03:28, Peter Geoghegan wrote:
> It would be great if you could take a look at the 'Add high key
> "continuescan" optimization' patch, which is the only one you haven't
> commented on so far (excluding the amcheck "relocate" patch, which is
> less important). I can put that one off for a while after the first 3
> go in. I will also put off the "split after new item" commit for at
> least a week or two. I'm sure that the idea behind the "continuescan"
> patch will now seem pretty obvious to you -- it's just taking
> advantage of the high key when an index scan on the leaf level (which
> uses a search style scankey, not an insertion style scankey) looks
> like it may have to move to the next leaf page, but we'd like to avoid
> it where possible. Checking the high key there is now much more likely
> to result in the index scan not going to the next page, since we're
> more careful when considering a leaf split point these days. The high
> key often looks like the items on the page to the right, not the items
> on the same page.
Oh yeah, that makes perfect sense. I wonder why we haven't done it like
that before? The new page split logic makes it more likely to help, but
even without that, I don't see any downside.
I find it a bit confusing, that the logic is now split between
_bt_checkkeys() and _bt_readpage(). For a forward scan, _bt_readpage()
does the high-key check, but the corresponding "first-key" check in a
backward scan is done in _bt_checkkeys(). I'd suggest moving the logic
completely to _bt_readpage(), so that it's in one place. With that,
_bt_checkkeys() can always check the keys as it's told, without looking
at the LP_DEAD flag. Like the attached.
|Next Message||MikalaiKeida||2019-03-14 11:01:06||RE: Timeout parameters|
|Previous Message||Kyotaro HORIGUCHI||2019-03-14 10:55:53||Re: Is PREPARE of ecpglib thread safe?|