Re: Making all nbtree entries unique by having heap TIDs participate in comparisons

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
Date: 2019-03-07 09:06:24
Message-ID: CAH2-Wzkt1yjr1XY4Afftv4iUP3T=kcfM3NRRtihqQY+jzeYxCg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 6, 2019 at 11:41 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> I don't understand it :-(. I guess that's valuable feedback on its own.
> I'll spend more time reading the code around that, but meanwhile, if you
> can think of a simpler way to explain it in the comments, that'd be good.

One more thing on this: If you force bitmap index scans (by disabling
index-only scans and index scans with the "enable_" GUCs), then you
get EXPLAIN (ANALYZE, BUFFERS) instrumentation for the index alone
(and the heap, separately). No visibility map accesses, which obscure
the same numbers for a similar index-only scan.

You can then observe that most searches of a single value will touch
the bare minimum number of index pages. For example, if there are 3
levels in the index, you should access only 3 index pages total,
unless there are literally hundreds of matches, and cannot avoid
storing them on more than one leaf page. You'll see that the scan
touches the minimum possible number of index pages, because of:

* Many duplicates strategy. (Not single value strategy, which I
incorrectly mentioned in relation to this earlier.)

* The !minusinfykey optimization, which ensures that we go to the
right of an otherwise-equal pivot tuple in an internal page, rather
than left.

* The "continuescan" high key patch, which ensures that the scan
doesn't go to the right from the first leaf page to try to find even
more matches. The high key on the same leaf page will indicate that
the scan is over, without actually visiting the sibling. (Again, I'm
assuming that your search is for a single value.)

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Steele 2019-03-07 09:26:06 Re: [HACKERS] proposal: schema variables
Previous Message David Steele 2019-03-07 08:56:54 Re: Re: PostgreSQL vs SQL/XML Standards