Re: Minmax indexes

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Minmax indexes
Date: 2014-08-06 02:35:01
Message-ID: 20140806023501.GF9388@eldon.alvh.no-ip.org
Lists: pgsql-hackers

FWIW I think I haven't responded appropriately to the points raised by
Heikki. Basically, as I see it there are three main items:

1. the revmap physical-to-logical mapping is too complex; let's use
something else.

We originally kept the revmap in a separate fork. The current approach
grew out of the need to put it in the main fork while still allowing
fast access to individual pages. There are of course many ways to skin
this cat; Heikki's proposal is to have the revmap always occupy the
first few physical pages, rather than require a logical-to-physical
mapping table. To implement this he proposes to move other pages out of
the way as the index grows. I don't really have much love for this
idea. We can change how this is implemented later in the cycle, if we
find that a different approach is better than my proposal. I don't want
to spend endless time meddling with this (and I definitely don't want
it to delay the eventual commit of the patch).
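
To make the comparison concrete, here is a minimal sketch, in
backend-style C, of how a heap block number could be translated into a
revmap location under either layout. The names (REVMAP_ITEMS_PER_PAGE,
map_array and so on) are invented for illustration; this is not the
patch's actual code.

#include <stdint.h>

typedef uint32_t BlockNumber;	/* stands in for the backend's BlockNumber */

#define REVMAP_ITEMS_PER_PAGE	1024	/* revmap entries per revmap page */

/* Logical revmap page that holds the entry for a given heap block. */
static inline BlockNumber
revmap_logical_page(BlockNumber heap_blk, unsigned pages_per_range)
{
	BlockNumber	range_no = heap_blk / pages_per_range;

	return range_no / REVMAP_ITEMS_PER_PAGE;
}

/*
 * Current patch: an explicit logical-to-physical table (here a plain
 * array standing in for the metapage's mapping) says where each revmap
 * page physically lives, so revmap pages can appear anywhere in the
 * main fork.
 */
static BlockNumber
revmap_physical_mapped(BlockNumber logical, const BlockNumber *map_array)
{
	return map_array[logical];
}

/*
 * Heikki's proposal: revmap pages always occupy the first physical
 * pages right after the metapage (block 0), so no table is needed;
 * growing the revmap means moving whatever regular page currently
 * sits there.
 */
static BlockNumber
revmap_physical_contiguous(BlockNumber logical)
{
	return 1 + logical;		/* block 0 is the metapage */
}

Either way a revmap lookup stays O(1); the difference is whether we pay
for an extra mapping-table level or for relocating pages as the revmap
grows.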

2. vacuuming is not optimal

Right now, to eliminate garbage index tuples we first need to scan the
revmap to figure out which tuples are unreferenced. The concern is that
if there is an excess of dead tuples, the index becomes unvacuumable
because the palloc() for the dead-tuple array exceeds the maximum
allocation size. This is largely theoretical, because for that to
happen there would need to be several million dead index tuples. As a
minimal fix that alleviates the problem without requiring a complete
rework of vacuuming, we can cap that palloc request at
maintenance_work_mem and remove dead tuples in a loop instead of trying
to remove all of them in a single pass.
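
As a sketch of that loop (assuming hypothetical helpers
collect_dead_tuples() and remove_dead_tuples(); this is not code from
the patch), the idea is just to size the dead-tuple array from
maintenance_work_mem, never exceeding the palloc limit, and repeat
until a pass no longer fills it:

#include "postgres.h"

#include "miscadmin.h"		/* maintenance_work_mem */
#include "storage/itemptr.h"
#include "utils/memutils.h"	/* MaxAllocSize */
#include "utils/rel.h"

/* Hypothetical helpers standing in for the real vacuum machinery. */
extern long collect_dead_tuples(Relation index, ItemPointer items,
					long max_items);
extern void remove_dead_tuples(Relation index, ItemPointer items,
					long nitems);

static void
minmax_vacuum_in_batches(Relation index)
{
	/* maintenance_work_mem is in kB; also respect palloc's hard limit */
	Size		max_bytes = Min((Size) maintenance_work_mem * 1024,
								MaxAllocSize);
	long		max_items = max_bytes / sizeof(ItemPointerData);
	ItemPointer	dead_items = palloc(max_items * sizeof(ItemPointerData));
	long		ndead;

	do
	{
		/* one revmap scan, stopping once the capped array is full */
		ndead = collect_dead_tuples(index, dead_items, max_items);

		if (ndead > 0)
			remove_dead_tuples(index, dead_items, ndead);
	} while (ndead == max_items);	/* array filled up; there may be more */

	pfree(dead_items);
}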

Another proposal was to store range numbers (or just heap page numbers)
within each index tuple. I felt that this would add bloat
unnecessarily. However, there is some padding space in the index tuple
that we could perhaps use to store range numbers. I will think some
more about how we can use this to simplify vacuuming.

3. avoid MMTuple as it is just unnecessary extra complexity.

The main thing MMTuple buys us is not the 2 bytes we save by storing a
BlockNumber as-is instead of inside a TID field. Rather, it's that we
can construct and deconstruct tuples using our own design, which means
we can use however many Datum entries we want and however many "null"
flags. In normal heap and index tuples there are always the same number
of datums and nulls. In minmax, the number of null flags is twice the
number of indexed columns, and the number of datum values is determined
by how many values each opclass stores per column ("sortable" opclasses
store two, but geometry would store only one). If we were to use
regular IndexTuples we would lose that ... and I have no idea how it
would work. In other words, MMTuples look fine to me.
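
For illustration only, here is a tiny sketch of that counting rule
(MinmaxLayout and values_per_column are invented names, not the actual
MMTuple structures):

/*
 * Per indexed column we keep two null flags; the number of stored
 * datums depends on the opclass (2 for a sortable min/max opclass,
 * 1 for something like a bounding box).
 */
typedef struct MinmaxLayout
{
	int		ncolumns;	/* indexed columns */
	int		nnullflags;	/* 2 per column */
	int		ndatums;	/* sum of per-opclass stored values */
} MinmaxLayout;

static MinmaxLayout
minmax_layout(int ncolumns, const int *values_per_column)
{
	MinmaxLayout layout;
	int		i;

	layout.ncolumns = ncolumns;
	layout.nnullflags = 2 * ncolumns;	/* twice the indexed columns */
	layout.ndatums = 0;
	for (i = 0; i < ncolumns; i++)
		layout.ndatums += values_per_column[i];

	return layout;
}

A regular IndexTuple, by contrast, is locked to one datum and one null
bit per attribute of the tuple descriptor, which is exactly the
flexibility we would lose.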

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
