From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
---|---|
To: | Josh Berkus <josh(at)agliodbs(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [RFC] Minmax indexes |
Date: | 2013-06-15 15:15:06 |
Message-ID: | CA+U5nMKL2h6-fXHTJix_YEktFKjDOXOTnD5=UtDF8qSoVpqmzQ@mail.gmail.com |
Lists: | pgsql-hackers |
On 15 June 2013 00:01, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> Alvaro,
>
> This sounds really interesting, and I can see the possibilities.
> However ...
>
>> Value changes in columns that are part of a minmax index, and tuple insertion
>> in summarized pages, would invalidate the stored min/max values. To support
>> this, each minmax index has a validity map; a range can only be considered in a
>> scan if it hasn't been invalidated by such changes (A range "not considered" in
>> the scan needs to be returned in whole regardless of the stored min/max values,
>> that is, it cannot be pruned per query quals). The validity map is very
>> similar to the visibility map in terms of performance characteristics: quick
>> enough that it's not contentious, allowing updates and insertions to proceed
>> even when data values violate the minmax index conditions. An invalidated
>> range can be made valid by re-summarization (see below).
>
> This begins to sound like these indexes are only useful on append-only
> tables. Not that there aren't plenty of those, but ...
The index is basically using the "index only scan" mechanism. The
"only useful on append-only tables" comment would/should apply also to
index only scans. I can't see a reason to raise that specifically for
this index type.
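
For readers following the analogy, here is a minimal sketch of how a scan might consult the validity map before trusting a stored min/max summary. It is a toy model, not the proposed implementation: `RangeSummary`, `ranges_to_scan`, and the integer values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RangeSummary:
    lo: int      # stored minimum for the page range
    hi: int      # stored maximum for the page range
    valid: bool  # validity-map bit; False once the range has been invalidated


def ranges_to_scan(summaries, qlo, qhi):
    """Return the indexes of page ranges that must be read for qlo <= x <= qhi."""
    selected = []
    for i, s in enumerate(summaries):
        if not s.valid:
            # Invalidated range: the stored min/max cannot be trusted,
            # so the range is returned in whole and never pruned.
            selected.append(i)
        elif s.lo <= qhi and s.hi >= qlo:
            # Valid range whose [lo, hi] interval overlaps the query interval.
            selected.append(i)
    return selected


summaries = [
    RangeSummary(1, 100, True),
    RangeSummary(101, 200, True),
    RangeSummary(201, 300, False),  # invalidated by an insert or update
]
selected = ranges_to_scan(summaries, 150, 160)
```

In this sketch the first range is pruned, the second is read because it overlaps the query, and the third is read only because it has been invalidated. That mirrors how an index-only scan falls back to the heap when a page is not marked all-visible.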
>> Re-summarization is relatively expensive, because the complete page range has
>> to be scanned.
>
> Why? Why can't we just update the affected pages in the index?
Again, same thing as index-only scans. For IOS, we reset the
visibility info at vacuum. The route proposed here follows exactly the
same timing and the same mechanism. I can't see a reason for any
difference between the two.
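
A rough sketch of that timing, under the assumption that a write merely clears a validity bit and that a later vacuum-like pass recomputes the summary by reading every page in the range. The names `resummarize`, `pages`, and `pages_per_range` are hypothetical, and the table is modelled as a list of pages holding plain integers.

```python
def resummarize(pages, pages_per_range, range_no):
    """Recompute (min, max) for one page range by reading all of its pages."""
    start = range_no * pages_per_range
    values = [v for page in pages[start:start + pages_per_range] for v in page]
    return (min(values), max(values)) if values else None


pages = [[5, 9], [12, 7], [40, 41], [38, 50]]
pages_per_range = 2

# Initial summarization of both ranges.
summaries = [resummarize(pages, pages_per_range, r) for r in range(2)]
valid = [True, True]

# A write removes the current minimum of range 0. Only the validity bit
# is touched at write time; the stored summary is left stale.
pages[0].remove(5)
valid[0] = False
stale = summaries[0]

# Later, a vacuum-like pass rescans the whole range and revalidates it.
summaries[0] = resummarize(pages, pages_per_range, 0)
valid[0] = True
```

Once the old minimum is gone, the new minimum cannot be derived from the stored summary alone, which is one way to see why the sketch rescans the complete range rather than patching a single page's entry.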
>> To avoid this, a table having a minmax index would be
>> configured so that inserts only go to the page(s) at the end of the table; this
>> avoids frequent invalidation of ranges in the middle of the table. We provide
>> a table reloption that tweaks the FSM behavior, so that summarized pages are
>> not candidates for insertion.
>
> We haven't had an index type which modifies table insertion behavior
> before, and I'm not keen to start now; imagine having two indexes on the
> same table each with their own, conflicting, requirements. This is
> sounding a lot more like a candidate for our prospective pluggable
> storage manager. Also, the above doesn't help us at all with UPDATEs.
>
> If we're going to start adding reloptions for specific table behavior,
> I'd rather think of all of the optimizations we might have for a
> prospective "append-only table" and bundle those, rather than tying it
> to whether a certain index exists or not.
I agree that the FSM behaviour shouldn't be linked to index existence.
IMHO that should be a separate table parameter, WITH (fsm_mode = append).
Index-only scans would also benefit from that.
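
To make the suggestion concrete, here is a toy sketch of what an append mode could mean for page selection. It assumes a simple first-fit search over per-page free space, which is not how the real FSM works; `choose_insert_page` and its arguments are hypothetical, and only the `fsm_mode = append` spelling comes from the text above.

```python
def choose_insert_page(free_space, needed, fsm_mode="default"):
    """Pick a page for a new tuple; returning len(free_space) means 'extend the table'."""
    n = len(free_space)
    if fsm_mode == "append":
        # Append mode: only the last page is a candidate, so earlier
        # pages are never reused even if they have room.
        if n > 0 and free_space[-1] >= needed:
            return n - 1
        return n
    # Default mode: reuse the first page with enough free space.
    for page_no, free in enumerate(free_space):
        if free >= needed:
            return page_no
    return n


free_space = [500, 0, 300, 100]  # free bytes per page, assumed values
default_choice = choose_insert_page(free_space, 200)
append_choice = choose_insert_page(free_space, 200, fsm_mode="append")
small_append_choice = choose_insert_page(free_space, 50, fsm_mode="append")
```

With these assumed values the default mode reuses page 0, while append mode extends the table for the 200-byte tuple and uses the last page for the 50-byte one. Nothing in the sketch depends on whether an index exists, which is the point of making it a table-level parameter.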
> Also, I hate the name ... if this feature goes ahead, I'm going to be
> lobbying to change it. But that's pretty minor compared to the update
> issues.
This feature has already had 3 different names. I don't think the name
is crucial, but it makes sense to give it a name up front. So if you
want to lobby for that then you'd need to come up with a name soon, so
poor Alvaro can cope with name #4.
(There's no consistency in naming from any other implementation either).
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services