Re: Minmax indexes

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Jim Nasby <jim(at)nasby(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Minmax indexes
Date: 2013-09-30 11:07:42
Message-ID: 52495B7E.8080905@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 27.09.2013 21:43, Greg Stark wrote:
> On Fri, Sep 27, 2013 at 7:22 PM, Jim Nasby<jim(at)nasby(dot)net> wrote:
>>
>> Yeah, we obviously kept things simpler when adding forks in order to get the feature out the door. There's improvements that need to be made. But IMHO that's not reason to automatically avoid forks; we need to consider the cost of improving them vs what we gain by using them.
>
> We think this gives short change to the decision to introduce forks.
> If you go back to the discussion at the time it was a topic of debate
> and the argument which won the day is that interleaving different
> streams of data in one storage system is exactly what the file system
> is designed to do and we would just be reinventing the wheel if we
> tried to do it ourselves. I think that makes a lot of sense for things
> like the fsm or vm which grow indefinitely and are maintained by a
> different piece of code from the main heap.
>
> The tradeoff might be somewhat different for the pieces of a data
> structure like a bitmap index or gin index where the code responsible
> for maintaining it is all the same.

There are quite a dfew cases where we have several "streams" of data,
all related to a single relation. We've solved them all in slightly
different ways:

1. TOAST. A separate heap relation with accompanying b-tree index is
created.

2. GIN. GIN contains a b-tree, and data pages (and somer other kinds of
pages too IIRC). It would be natural to use the regular B-tree code for
the B-tree, but instead it contains a completely separate
implementation. All the different kinds of streams are stored in the
main fork.

3. Free space map. Stored as a separate fork.

4. Visibility map. Stored as a separate fork.

And upcoming:

5. Minmax indexes, with the linearly-addressed range reverse map and
variable lenghth index tuples.

6. Bitmap indexes. Like in GIN, there's a B-tree and the data pages
containing the bitmaps.

A nice property of the VM and FSM forks currently is that they are just
auxiliary information to speed things up. You can safely remove them
(when the server is shut down), and the system will recreate them on
next vacuum. It's not carved in stone that it has to be that way for all
extra forks, but it is today and I like it.

I feel we need a new kind of a relation fork, something more
heavy-weight than the current forks, but not as heavy-weight as the way
TOAST does it. It would be nice if GIN and bitmap indexes could use the
regular nbtree code. Or any other index type - imagine a bitmap index
using a SP-GiST index instead of a B-tree! You could create a bitmap
index for 2d points, and use it to speed up operations like overlap for
example.

The nbtree code expects the data to be in the main fork and uses the FSM
fork too. Maybe it could be abstracted, so that the regular b-tree could
be used as part of another index type. Same with other indexams.

Perhaps relation forks need to be made more flexible, allowing access
methods to define what forks exists. IOW, let's not avoid using relation
forks, let's make them better instead.

- Heikki

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2013-09-30 11:17:39 Re: Minmax indexes
Previous Message Ian Lawrence Barwick 2013-09-30 10:49:00 Re: review: psql and pset without any arguments