Re: [RFC] Minmax indexes

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Jim Nasby <jim(at)nasby(dot)net>
Cc: Claudio Freire <klaussfreire(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [RFC] Minmax indexes
Date: 2013-06-28 17:56:05
Message-ID: 51CDCE35.7070203@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


>> On Fri, Jun 28, 2013 at 2:18 PM, Jim Nasby <jim(at)nasby(dot)net> wrote:
>>> If we add a dirty flag it would probably be wise to allow for more
>>> than one
>>> value so we can do a clock-sweep. That would allow for detecting a range
>>> that is getting dirtied repeatedly and not bother to try and
>>> re-summarize it
>>> until later.

Given that I'm talking about doing the resummarization at VACUUM time, I
don't think that's justified. That only makes sense if you're doing the
below ...

>>> Something else I don't think was mentioned... re-summarization should be
>>> somehow tied to access activity: if a query will need to seqscan a
>>> segment
>>> that needs to be summarized, we should take that opportunity to
>>> summarize at
>>> the same time while pages are in cache. Maybe that can be done in the
>>> backend itself; maybe we'd want a separate process.

Writing at SELECT time is something our users hate, with significant
justification. This suggestion is possibly a useful optimization, but
I'd like to see the simplest possible version of minmax indexes go into
9.4, so that we can see how they work in practice before we start adding
complicated performance optimizations involving extra write overhead and
background workers. We're more likely to engineer an expensive
de-optimization that way.

I wouldn't be unhappy to have the very basic minmax indexes -- the one
where we just flag pages as invalid if they get updated -- go into 9.4.
That would still work for large, WORM tables, which are a significant
and pervasive use case. If Alvaro gets to it, I think the "updating the
range, rescan at VACUUM time" approach isn't much more complex, but if
he doesn't I'd still vote to accept the feature.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Fetter 2013-06-28 18:27:01 Re: Documentation/help for materialized and recursive views
Previous Message Alvaro Herrera 2013-06-28 17:39:17 Re: Documentation/help for materialized and recursive views