Re: Display number of heap accesses for index scans

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Display number of heap accesses for index scans
Date: 2017-11-05 21:04:12
Message-ID: 20171105210412.GA5006@marmot
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> wrote:
>The number of index lookups that failed to return anything can be a
>critical performance factor in OLTP workloads. Therefore it seems like
>it'd be a good idea to extend the explain analyze output to include that
>information.

I certainly agree.

I've sometimes wondered if we could do better here by exploiting unique
indexes' uniqueness property. The general idea is that we could avoid
visiting duplicates in the heap when we've determined that they cannot
be visible to our MVCC snapshot - we already encountered the visible
version for that value, so this must be true. This is somewhat like what
we do with HOT chains, except it occurs across index tuples with the
same value rather than across heap tuples within a HOT chain.

The big difficulty is LP_DEAD bit setting within B-Tree pages; if no one
is visiting the heap, no one can opportunistically mark the bit to
indicate that nobody needs to visit the heap tuple pointed to by the
index tuple (because the heap tuple is dead to everyone). Now, there
isn't an immediate problem with that, because the main point of both
optimizations is to avoid heap visits. But there is a problem when there
needs to be a page split and there is a lack of LP_DEAD bits set from
which to reclaim space to defer the page split (in nbtree, we can
reclaim the space needed for our an index tuple that needs to be
inserted without actually spliting, in the end). We cannot undermine the
secondary goal of the LP_DEAD/kill_prior_tuple optimization, which is
reclaiming space early within B-Tree pages.

I can imagine a way of addressing this problem, though it is very
invasive - versioning distinct index tuple versions in the index with
ordinal version numbers. I wonder if anyone else has thought about doing
something like this, and has a better idea, though. I think that we
could be cleverer about unique indexes in a number of different ways
[1]. Reducing heap accesses is a big one.

[1] https://wiki.postgresql.org/wiki/Key_normalization#Making_all_items_in_the_index_unique_by_treating_heap_TID_as_an_implicit_last_attribute
--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim Van Fleet 2017-11-05 21:33:24 Re: [POC] Faster processing at Gather node
Previous Message Robert Haas 2017-11-05 20:23:01 Re: WIP: long transactions on hot standby feedback replica / proof of concept