Re: Question about indexes

From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Question about indexes
Date: 2004-01-30 22:43:54
Message-ID: 1075502634.4007.32.camel@fuji.krosing.net
Lists: pgsql-hackers

Tom Lane wrote on Fri, 30.01.2004 at 16:48:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > Also, what does an in-memory bitmapped index look like?
>
> One idea that might work: a binary search tree in which each node
> represents a single page of the table, and contains a bit array with
> one bit for each possible item number on the page. You could not need
> more than BLCKSZ/(sizeof(HeapTupleHeaderData)+sizeof(ItemIdData)) bits
> in a node, or about 36 bytes at default BLCKSZ --- for most tables you
> could probably prove it would be a great deal less. You only allocate
> nodes for pages that have at least one interesting row.

Another idea would be to use bitmaps with just one bit per database
page, and then do a seq scan over only the marked pages.

Even when allocated in full, such an index would occupy just
1/(8k*8bit) of the amount it describes, so the index for a 1GB table
would be 1GB/(8k*8bit) = 16 kilobytes (2 pages).

Also, such indexes, if made persistent, could be used (together with
the FSM) when deciding where to place new tuples, so they would
provide a form of clustering.

This would of course be most useful for data-warehouse type workloads,
where the database is significantly bigger than memory.

And the seqscan over the bitmap should not be done in simple page
order, but rather in two passes:
1. over those pages which are already in cache (either PostgreSQL's
or the system's, if we find a way to get such info from the system);
2. in sequential order over the rest.

> I think this would represent a reasonable compromise between size and
> insertion speed. It would only get large if the indexscan output
> demanded visiting many different pages --- but at some point you could
> abandon index usage and do a sequential scan, so I think that property
> is okay.

One case where an almost-full intermediate bitmap could be needed is
when doing a star join, or simply an AND of several conditions, where
each single index spans a significant part of the table but the
combined result does not.

> A variant is to make the per-page bit arrays be entries in a hash table
> with page number as hash key. This would reduce insertion to a nearly
> constant-time operation, but the drawback is that you'd need an explicit
> sort at the end to put the per-page entries into page number order
> before you scan 'em. You might come out ahead anyway, not sure.
>
> Or we could try a true linear bitmap (indexed by page number times
> max-items-per-page plus item number) that's compressed in some fashion,
> probably just by eliminating large runs of zeroes. The difficulty here
> is that inserting a new one-bit could be pretty expensive, and we need
> it to be cheap.
>
> Perhaps someone can come up with other better ideas ...

I have also contemplated a scenario where we could use a linear bitmap
with a not-quite-max power-of-2 number of bits per page, and mark
intra-page wraps (when we try to set a bit past that not-quite-max
number for a page) in a high bit (or in another bitmap), marking the
info for that page as folded. An example would be setting bit 40 in a
32-bits-per-page index: this would set bit 40&31 (i.e. bit 8) and mark
the page as folded.

When combining such indexes using AND or OR we would need some special
handling of folded pages, but we could still get an exact, non-folded
(empty) result out of the AND of two folded pages if the bits are
distributed nicely.

--------------
Hannu
