Re: Index AM change proposals, redux

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Index AM change proposals, redux
Date: 2008-04-24 11:13:09
Message-ID: 20080424111309.GE14647@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 24, 2008 at 10:11:02AM +0100, Simon Riggs wrote:
> Index compression is possible in many ways, depending upon the
> situation. All of the following sound similar at a high level, but each
> covers a different use case.

True, but there is one significant difference:

> * For Long, Similar data e.g. Text we can use Prefix Compression
> * For Highly Non-Unique Data we can use Duplicate Compression
> * Multi-Column Leading Value Compression - if you have a multi-column

These are all not lossy and so are candidate to use on any b-tree even
by default. They don't affect plan-construction materially, except
perhaps in cost calculations. Given the index tuple overhead I don't
see how you could lose.

> * For Unique/nearly-Unique indexes we can use Range Compression

This one is lossy and so does affect possible plans. I beleive the term
for this is "sparse" index. Also, it's seems harder to optimise, since
it would seem to me the index would have to have some idea on how
"close" values are together.

> As with HOT, all of these techniques need to be controlled by
> heuristics. Taken to extremes, these techniques can hurt other aspects
> of performance, so its important we don't just write the code but also
> investigate reasonable limits for behaviour also.

For the first three I can't imagine what the costs would be, except the
memory usage to store the unduplicated data once when its read in.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2008-04-24 11:28:21 Re: Batch update of indexes on data loading
Previous Message Simon Riggs 2008-04-24 10:59:37 Re: Index AM change proposals, redux