Re: Minmax indexes

From: Thom Brown <thom(at)linux(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Minmax indexes
Date: 2013-09-16 08:47:29
Message-ID: CAA-aLv6_v2QeFdS3xs28DnrV5joUU13H0eFYWg7Pax4ZoVPo6g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 15 September 2013 01:14, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:

> Hi,
>
> Here's a reviewable version of what I've dubbed Minmax indexes. Some
> people said they would like to use some other name for this feature, but
> I have yet to hear usable ideas, so for now I will keep calling them
> this way. I'm open to proposals, but if you pick something that cannot
> be abbreviated "mm" I might have you prepare a rebased version which
> renames the files and structs.
>
> The implementation here has been simplified from what I originally
> proposed at 20130614222805(dot)GZ5491(at)eldon(dot)alvh(dot)no-ip(dot)org -- in particular,
> I noticed that there's no need to involve aggregate functions at all; we
> can just use inequality operators. So the pg_amproc entries are gone;
> only the pg_amop entries are necessary.
>
> I've somewhat punted on the question of doing resummarization separately
> from vacuuming. Right now, resummarization (as well as other necessary
> index cleanup) takes place in amvacuumcleanup. This is not optimal; I
> have stated elsewhere that I'd like to create separate maintenance
> actions that can be carried out by autovacuum. That would be useful
> both for Minmax indexes and GIN indexes (pending insertion list); maybe
> others. That's not part of this patch, however.
>
> The design of this stuff is in the file "minmax-proposal" at the top of
> the tree. That file is up to date, though it still contains some open
> questions that were present in the original proposal. (I have not fixed
> some bogosities pointed out by Noah, for instance. I will do that
> shortly.) In a final version, that file would be applied as
> src/backend/access/minmax/README, most likely.
>
> One area on which I needed to modify core code is IndexBuildHeapScan. I
> needed a version that was able to scan only a certain range of pages,
> not the entire table, so I introduced a new IndexBuildHeapRangeScan, and
> added a quick "heap_scansetlimits" function. I haven't tested that this
> works outside of the HeapRangeScan thingy, so it's probably completely
> bogus; I'm open to suggestions if people think this should be
> implemented differently. In any case, keeping that implementation
> together with vanilla IndexBuildHeapScan makes a lot of sense.
>
> One thing still to tackle is when to mark ranges as unsummarized. Right
> now, any new tuple on a page range would cause a new index entry to be
> created and a new revmap update. This would cause huge index bloat if,
> say, a page is emptied and vacuumed and filled with new tuples with
> increasing values outside the original range; each new tuple would
> create a new index tuple. I have two ideas about this (1. mark range as
> unsummarized if 3rd time we touch the same page range; 2. vacuum the
> affected index page if it's full, so we can maintain the index always up
> to date without causing unduly bloat), but I haven't implemented
> anything yet.
>
> The "amcostestimate" routine is completely bogus; right now it returns
> constant 0, meaning the index is always chosen if it exists.
>
> There are opclasses for int4, numeric and text. The latter doesn't work
> at all, because collation info is not passed down at all. I will have
> to figure that out (even if I find unlikely that minmax indexes have any
> usefulness on top of text columns). I admit that numeric hasn't been
> tested, and it's quite likely that they won't work; mainly because of
> lack of some datumCopy() calls, about which the code contains some
> /* XXX */ lines. I think this should be relatively straightforward.
> Ideally, the final version of this patch would contain opclasses for all
> supported datatypes (i.e. the same that have got btree opclasses).
>
> I have messed up the opclass information, as evidenced by failures in
> opr_sanity regression test. I will research that later.
>
>
> There's working contrib/pageinspect support; pg_xlogdump (and wal_debug)
> seems to work sanely too.
> This patch compiles cleanly under -Werror.
>
> The research leading to these results has received funding from the
> European Union's Seventh Framework Programme (FP7/2007-2013) under
> grant agreement n° 318633

Thanks for the patch, but I seem to have immediately hit a snag:

pgbench=# CREATE INDEX minmaxtest ON pgbench_accounts USING minmax (aid);
PANIC: invalid xlog record length 0

--
Thom

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Marko Tiikkaja 2013-09-16 08:57:16 Re: plpgsql.print_strict_params
Previous Message Boszormenyi Zoltan 2013-09-16 08:39:52 Re: UNNEST with multiple args, and TABLE with multiple funcs