Re: [RFC] Minmax indexes

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [RFC] Minmax indexes
Date: 2013-06-17 20:12:48
Message-ID: 20130617201248.GE3537@eldon.alvh.no-ip.org
Lists: pgsql-hackers

Greg Stark wrote:
> On Fri, Jun 14, 2013 at 11:28 PM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
> > Re-summarization is relatively expensive, because the complete page range has
> > to be scanned.
>
> That doesn't sound too bad to me. It just means there's a downside to
> having larger page ranges. I would expect the page ranges to be
> something in the ballpark of 32 pages -- scanning 32 pages to
> resummarize doesn't sound that painful but sounds like it's large
> enough that the resulting index would be a reasonable size.

Actually, I'm thinking that a range is more like, say, 1280 pages, or 10
MB. My goal is to support tables on the order of 10 TB; if I store one
index tuple for every 32 pages, I end up with too large an index. With
10 MB per range I can index the whole table with an index of about 50
MB, which seems reasonable to scan. But in any case my intention is to
make the page range size configurable.
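The sizing argument above can be checked with some back-of-the-envelope
arithmetic (the ~48-byte index tuple size is my own rough assumption for
illustration; the real on-disk size depends on the indexed datatype):

```python
# Rough index-size estimate for a minmax index over fixed page ranges.
# PAGE_SIZE matches PostgreSQL's default block size; TUPLE_SIZE is an
# assumed, illustrative per-range index tuple size.
PAGE_SIZE = 8192
TUPLE_SIZE = 48

def index_size_bytes(table_bytes, pages_per_range):
    """Approximate minmax index size: one tuple per page range."""
    pages = table_bytes // PAGE_SIZE
    ranges = -(-pages // pages_per_range)  # ceiling division
    return ranges * TUPLE_SIZE

TEN_TB = 10 * 2**40
print(index_size_bytes(TEN_TB, 1280) // 2**20)  # 48 MB: reasonable to scan
print(index_size_bytes(TEN_TB, 32) // 2**20)    # 1920 MB: too large
```

So at 1280 pages per range the whole 10 TB table is covered by roughly
50 MB of index, while 32-page ranges inflate that by a factor of 40.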

> But I don't understand why an insert would invalid a tuple. An insert
> can just update the min and max incrementally. It's a delete that
> invalidates the range but as you note it doesn't really invalidate it,
> just mark it as needing a refresh -- and even then only if the value
> being deleted is equal to either the min or max.

No, I don't intend to update the index tuple with each heap insert; I
think that would end up being too slow. The validity map holds a single
bit for each page range saying whether the range is known valid or not;
one insert into any page in the range invalidates the whole range (and
any scan of the table then needs to scan that range in full). This way
I can wait until the storm of inserts has passed from a range, and only
then summarize it. This avoids having to re-summarize the range over
and over.
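A minimal sketch of that validity-map behavior (class and method names
are mine for illustration, not the patch's API): an insert only clears
the range's bit, summarization happens lazily, and a scan must visit
every invalid range plus any valid range whose bounds overlap the query.

```python
class MinmaxSummary:
    """Toy model of a minmax index over fixed-size page ranges."""

    def __init__(self, pages_per_range):
        self.pages_per_range = pages_per_range
        self.summaries = {}  # range number -> (min, max)
        self.valid = {}      # range number -> bool (the validity-map bit)

    def _range_of(self, page):
        return page // self.pages_per_range

    def on_insert(self, page):
        # An insert anywhere in a range merely clears its validity bit;
        # the stored summary is left alone until re-summarization.
        self.valid[self._range_of(page)] = False

    def resummarize(self, rng, values_in_range):
        # Done lazily, after the storm of inserts has passed: one full
        # scan of the range recomputes tight min/max bounds.
        self.summaries[rng] = (min(values_in_range), max(values_in_range))
        self.valid[rng] = True

    def ranges_to_scan(self, lo, hi):
        # Invalid ranges must always be scanned; valid ones only when
        # their [min, max] interval overlaps the query interval [lo, hi].
        out = []
        for rng, ok in self.valid.items():
            if not ok:
                out.append(rng)
            else:
                mn, mx = self.summaries[rng]
                if mn <= hi and mx >= lo:
                    out.append(rng)
        return sorted(out)
```

For example, after summarizing two ranges, a query that matches only the
second scans one range; a single insert into the first range forces it
back into every scan until it is re-summarized.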

Alternatively, I could consider the index tuple always valid, and update
it online as soon as we do an insert or update (i.e. examine the min/max
values in the index tuple, and update them if the new value is out of
bounds). This seems too slow, so I won't try it.

Also, a delete does not invalidate a range either. As Simon said
elsewhere in the thread, if a range's summary is not minimal (i.e. its
stored min/max bounds are wider than the actual values), that is not a
correctness problem; we might scan some more ranges than absolutely
necessary, but the heap scan recheck node will get rid of the unwanted
tuples anyway.
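The recheck argument can be illustrated with a small sketch (again
hypothetical names, not the patch's code): even when a stale summary
makes the scan visit a range it no longer needs to, re-applying the
qualifier to each heap tuple keeps the result correct.

```python
def range_scan(summaries, heap, lo, hi):
    """Scan using possibly-stale (over-wide) per-range summaries.

    summaries: range number -> (min, max), may be loose after deletes
    heap: range number -> list of values actually stored in that range
    """
    result = []
    for rng, (mn, mx) in summaries.items():
        if mn <= hi and mx >= lo:      # index-level filter (may be loose)
            for v in heap[rng]:
                if lo <= v <= hi:      # recheck discards false matches
                    result.append(v)
    return result
```

A deleted maximum leaves the summary claiming a wider interval than the
heap actually holds, so the range is scanned unnecessarily, but the
recheck returns no spurious tuples.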

> > Same-size page ranges?
> > Current related literature seems to consider that each "index entry" in a
> > minmax index must cover the same number of pages. There doesn't seem to be a
>
> I assume the reason for this in the literature is the need to quickly
> find the summary for a given page when you're handling an insert or
> delete.

Yeah, that's one thing to keep in mind. I haven't gotten too far into
this; I only added these two items at the end for my own future
reference, because I will need to consider them at some point.

Right now my intention is to have each index have a fixed page range
size, which is defined at index creation time.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
