From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [RFC] Minmax indexes
Date: 2013-06-17 20:12:48
Message-ID: 20130617201248.GE3537@eldon.alvh.no-ip.org
Lists: pgsql-hackers
Greg Stark wrote:
> On Fri, Jun 14, 2013 at 11:28 PM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
> > Re-summarization is relatively expensive, because the complete page range has
> > to be scanned.
>
> That doesn't sound too bad to me. It just means there's a downside to
> having larger page ranges. I would expect the page ranges to be
> something in the ballpark of 32 pages -- scanning 32 pages to
> resummarize doesn't sound that painful but sounds like it's large
> enough that the resulting index would be a reasonable size.
Actually, I'm thinking that a range is more like, say, 1280 pages, or 10
MB. My goal is to consider tables in the 10 TB magnitude; if I store
one index tuple for every 32 pages, I would end up with too large an
index. With 10 MB per range I can index the whole table with an index
of 50 MB, which seems reasonable to scan. But anyway my intention is
that page range size is configurable.
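A quick back-of-the-envelope check of those numbers (the 8 kB page size is PostgreSQL's default; the 50 bytes per index tuple is an illustrative assumption, not a measured figure):

```python
PAGE_SIZE = 8 * 1024          # default PostgreSQL block size
RANGE_PAGES = 1280            # pages per range, as proposed above
TABLE_SIZE = 10 * 1024**4     # 10 TB table
TUPLE_BYTES = 50              # assumed size of one min/max index tuple

range_bytes = RANGE_PAGES * PAGE_SIZE         # bytes covered by one range
n_ranges = TABLE_SIZE // range_bytes          # index tuples needed
print(range_bytes // 1024**2)                 # -> 10   (MB per range)
print(n_ranges)                               # -> 1048576 ranges
print(n_ranges * TUPLE_BYTES // 1024**2)      # -> 50   (MB of index)

# For comparison, 32-page ranges on the same table:
n_small = TABLE_SIZE // (32 * PAGE_SIZE)
print(n_small * TUPLE_BYTES // 1024**2)       # -> 2000 (MB of index)
```

So with 32-page ranges the index for a 10 TB table would be around 2 GB, versus 50 MB with 1280-page ranges, which is the size trade-off being weighed here.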
> But I don't understand why an insert would invalidate a tuple. An insert
> can just update the min and max incrementally. It's a delete that
> invalidates the range but as you note it doesn't really invalidate it,
> just mark it as needing a refresh -- and even then only if the value
> being deleted is equal to either the min or max.
No, I don't intend to update the index tuple with each heap insert. I
think this will end up being too slow. The validity map is there to
hold a single bit for each page saying whether the page range is known
valid or not; one insert into any page in the range invalidates the
range (and any scan of the table needs to scan that range as a whole).
This way I can wait until the storm of inserts has passed from a range,
and only then do the summarization for that range. This avoids having
to summarize the range over and over.
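The validity map described above could be sketched roughly like this (names and representation are mine, not from any patch; a bytearray stands in for the on-disk bitmap):

```python
class ValidityMap:
    """One bit per page range: 1 = summary known valid, 0 = needs
    (re)summarization. All ranges start out unsummarized."""

    def __init__(self, n_ranges):
        self.bits = bytearray((n_ranges + 7) // 8)

    def invalidate(self, r):
        # Called on insert into any page of range r: clear its bit, so a
        # scan must read the whole range and a later pass resummarizes it.
        self.bits[r // 8] &= ~(1 << (r % 8)) & 0xFF

    def mark_valid(self, r):
        # Called once the range has been scanned and its min/max recomputed.
        self.bits[r // 8] |= 1 << (r % 8)

    def is_valid(self, r):
        return bool(self.bits[r // 8] & (1 << (r % 8)))
```

The point of the single bit is that a storm of inserts into one range costs one bit-clear each, and the expensive full-range scan happens once, afterwards, instead of once per insert.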
Alternatively, I could consider the index tuple always valid, and update
it online as soon as we do an insert or update (i.e. examine the min/max
values in the index tuple, and update it to match if the new value is
out of bounds). This seems too slow, so I won't try.
Also, a delete does not invalidate a range either. As Simon said
elsewhere in the thread, if the range is not minimal, this is not a
problem; we might have to scan some more ranges than absolutely
necessary, but it's not a correctness problem. The heap scan recheck
node will get rid of the unwanted tuples anyway.
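In other words, the range summaries only ever need to be conservative (possibly wider than the true min/max), because the scan treats them as a pruning hint. A rough sketch of the candidate-range selection, with a made-up in-memory representation:

```python
def candidate_ranges(summaries, key):
    """summaries: one (lo, hi, valid) triple per page range.
    A stale range (valid=False) must always be scanned; a valid range
    only if key falls within [lo, hi]. Any false positives that survive
    the range scan are discarded by the heap-scan recheck, so a summary
    that is wider than the true min/max (e.g. after deletes) is still
    correct, just slightly wasteful."""
    return [r for r, (lo, hi, valid) in enumerate(summaries)
            if not valid or lo <= key <= hi]

summaries = [(10, 20, True),    # valid, tight summary
             (5, 50, False),    # invalidated by recent inserts
             (100, 200, True)]  # valid, key out of bounds
print(candidate_ranges(summaries, 15))   # -> [0, 1]
```

Range 2 is pruned outright; range 1 must be scanned despite not matching, because its summary cannot be trusted until resummarization.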
> > Same-size page ranges?
> > Current related literature seems to consider that each "index entry" in a
> > minmax index must cover the same number of pages. There doesn't seem to be a
>
> I assume the reason for this in the literature is the need to quickly
> find the summary for a given page when you're handling an insert or
> delete.
Yeah, that's one thing to keep in mind. I haven't gotten too much into
this; I only added these two entries at the end for my own future
reference, because I will need to consider them at some point.
Right now my intention is to have each index have a fixed page range
size, which is defined at index creation time.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services