Bitmap Indexes: request for feedback

From: Gianni Ciolli <gianni(dot)ciolli(at)2ndquadrant(dot)it>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Gabriele Bartolini <gabriele(dot)bartolini(at)2ndquadrant(dot)it>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Bitmap Indexes: request for feedback
Date: 2008-10-21 14:57:59
Message-ID: 20081021145759.GA3349@fune
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi everybody,

me and Gabriele Bartolini have been working on Bitmap Indexes (BMI) in
the last weeks, with advice and guidance from Simon Riggs. We feel
that we are about to approach the point where it is appropriate to ask
for feedback from this list.

Thank you,
Dr. Gianni Ciolli - 2ndQuadrant Italia
PostgreSQL Training, Services and Support
gianni(dot)ciolli(at)2ndquadrant(dot)it | www.2ndquadrant.it

---8<------8<------8<------8<------8<------8<------8<------8<------8<---

First of all, let us describe what we have done so far, what we found
and how we think to proceed now.

== 1. Bringing the patch up to date ==

We started from Gavin Sherry's patch dating May 2007

http://archives.postgresql.org/pgsql-patches/2007-05/msg00013.php

As far as we could see, there has been no further activity on this
subject.

First, we brought the patch up to date. The latest version of the
patch was anterior to the new Index Access Method Interface

http://developer.postgresql.org/pgdocs/postgres/indexam.html

so we adapted the patch to that interface.

Then we added a few BMI page inspection functions to the pageinspect
contrib module, and we used them to examine the code. In addition to
finding and fixing a minor bug, we diagnosed an effect of HOT tuples
on the BMI patch, described below in greater detail. This also helped
us to produce extended descriptive documentation of how these indexes
work, and suggested us how to produce some more tests to verify that
(a newer version of) the BMI patch works; we are going to add some
regression tests especially targeted to HOT tuples.

After message

http://archives.postgresql.org/pgsql-hackers/2008-10/msg00855.php

maybe it is appropriate to mention that backwards scan would not be
supported at all by BMI indexes.

== 2. The effect of HOT tuples on BMI creation ==

The latest BMI patch mentioned above was also prior to the
introduction of HOT tuples.

Some parts of that patch rely on the assumption that
IndexBuildHeapScan scans tuples in increasing TID order. It is easy to
verify that this property is no longer valid after the introduction of
HOT tuples; however, a similar but weaker property still holds (the
scan is always done in non-decreasing block order).

This breaks some low-level bitmap vector build routines, which have to
be rewritten from scratch because they expect TIDs to came in
increasing order; but it does not harm block-level locking used in
that patch.

== 3. What we would do after ==

We understand that BMI development was suspended because of lack of
time from the last developer, during the improvement of the VACUUM
phase. The main obstacle was that the physical size of a compressed
bitmap vector can either grow or shrink, possibly creating new BMV
pages, which can mean bad performance.

The current VACUUM algorithm is unfinished; we are going to examine
it, looking for some improvements, and to measure the current status
with some ad-hoc benchmarks.

== 4. Timeline ==

Up to now, we spent many days to isolate, describe and partially fix
the incompatibilies described above; now we feel that points 1. and
2. can be cleared in a couple of days, bringing the patch up to date
with current HEAD.

As for the remaining part, we expect to finish the patch before the
deadline for the latest CommitFest.

We will re-post the patch as soon as the HOT tuples will be working;
then we will post a new version the patch when also VACUUM will be
done.

Does anybody have any comments and/or additional requests?

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2008-10-21 15:16:30 Re: SSL cleanups/hostname verification
Previous Message Tom Lane 2008-10-21 14:25:50 Re: automatic parser generation for ecpg