Re: proposal for smaller indexes on index-ordered tables

From: Zoltan Boszormenyi <zb(at)cybertec(dot)at>
To: Jeffrey Baker <jwbaker(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: proposal for smaller indexes on index-ordered tables
Date: 2008-06-24 20:59:48
Message-ID: 48616044.1000700@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jeffrey Baker írta:
> The way I read it, the current btree index stores the index value and
> the TID of every tuple having that value. When you have a table with
> three columns, you index one of them and you get an index which is
> practically as large as the table itself.
>
> Supposing the table is generally or strictly ordered by the column to
> be indexed, it would be more compact if the index stored ranges of
> tuples. Instead of storing the TID of every tuple with that value,
> the index would store a first and last TID, between which all tuples
> have the value.
>
> Example: table with one million rows indexed on a column having one
> thousand distinct values. Table is in-order by the indexed column.
> The traditional index would contain a million TIDs, whereas a range
> index would contain only two thousand. The range index would be 500
> times smaller, more likely to be cached, etc.
>
> Thoughts?
>
> -jwb

Example with your theory:
One (not yet committed) transaction changes one tuple
that was in the middle of a range before but the tuple's indexed
column changed. What would you do?
You need to keep track of multiple index versions:
1. the range has to be split for the not-yet-committed modifier transaction,
it might need to re-read the same table.
2. the old range has to be kept for reader transactions that still see
the old data
Imagine you have thousands of UPDATEs in flight on different rows.

Or you introduce "readers has to wait for writers locks" and
"updaters has to wait for other updaters on the same range"
that the MVCC implementation nicely avoids.

Look at MaxDB once, you'll appreciate PostgreSQL then.
MaxDB stores table tuples in the order of its primary key,
it uses a balanced btree for that. This means slower INSERTs and
UPDATEs and decreased concurrency compared to PostgreSQL.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeffrey Baker 2008-06-24 21:08:43 Re: proposal for smaller indexes on index-ordered tables
Previous Message daveg 2008-06-24 20:53:58 Re: [HACKERS] Patch for Prevent pg_dump/pg_restore from being affected by statement_timeout