Re: Index Tuple Compression Approach?

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "Dawid Kuroczko" <qnex42(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Index Tuple Compression Approach?
Date: 2007-08-15 20:54:17
Message-ID: 87bqd8y0w6.fsf@oxford.xeocode.com
Lists: pgsql-hackers


"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:

> That general approach of storing a common leading part just once is
> called prefix compression. Yeah, it helps a lot on long text fields.
> Tree structures like file paths in particular.

You kind of want to avoid storing both the prefix and the suffix, no?

> It's been discussed before. One big problem is extracting the common
> leading part. You could only do it for text,

Or for multi-column indexes.

I could see this being especially useful if you have some columns in the index
key which are small and some that are quite large. So if you have an event
table with an index on <userid,timestamp> you wouldn't have to store lots of
timestamps on the upper level tree nodes. You would only store them for the
leaf nodes.
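
To make the <userid,timestamp> case concrete, here is a rough sketch of how an
upper-level separator key could drop trailing columns. It is only an
illustration in Python, not PostgreSQL code: the function name
truncated_separator and the tuple representation of index keys are assumptions
made for the example.

```python
from typing import Tuple


def truncated_separator(left_last: Tuple, right_first: Tuple) -> Tuple:
    """Return the shortest leading-column prefix of right_first that still
    sorts strictly after left_last.

    left_last   -- highest key on the left leaf page (hypothetical input)
    right_first -- lowest key on the right leaf page (hypothetical input)
    """
    for ncols in range(1, len(right_first) + 1):
        candidate = right_first[:ncols]
        # Stop at the first column where the two keys differ; everything
        # after that column is not needed to route a search.
        if left_last[:ncols] < candidate:
            return candidate
    # Keys are equal on every column, so nothing can be dropped.
    return right_first


# userid changes across the page boundary: the timestamp is not needed.
print(truncated_separator((1, 900), (2, 100)))
# Same userid on both sides: the timestamp must be kept.
print(truncated_separator((2, 100), (2, 200)))
```

When the small leading column already distinguishes the two pages, the large
trailing column never reaches the upper levels of the tree.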

> but it should be done in a datatype neutral way.

I wonder if there's an analogous operation for other data types though.
Numeric could store a value relative to the parent value. Arrays could
store only the elements needed. bytea of course works just as well as text (or
better in the face of i18n).
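
A minimal sketch of two of these ideas, assuming keys are held as Python bytes
and ints: a byte-wise common prefix stored once per page, and numeric values
stored as offsets from a parent value. The helper names (common_prefix,
compress_page, decompress_page, delta_encode, delta_decode) are hypothetical
and do not correspond to anything in PostgreSQL.

```python
from typing import List, Tuple


def common_prefix(keys: List[bytes]) -> bytes:
    """Longest byte string that every key starts with."""
    if not keys:
        return b""
    # The smallest and largest keys bound all the others, so their shared
    # prefix is shared by every key in between.
    lo, hi = min(keys), max(keys)
    n = 0
    while n < len(lo) and n < len(hi) and lo[n] == hi[n]:
        n += 1
    return lo[:n]


def compress_page(keys: List[bytes]) -> Tuple[bytes, List[bytes]]:
    """Store the shared prefix once and keep only each key's remainder."""
    prefix = common_prefix(keys)
    return prefix, [k[len(prefix):] for k in keys]


def decompress_page(prefix: bytes, suffixes: List[bytes]) -> List[bytes]:
    """Rebuild the original keys from the prefix and the remainders."""
    return [prefix + s for s in suffixes]


def delta_encode(parent: int, values: List[int]) -> List[int]:
    """Store each value as an offset from the parent value."""
    return [v - parent for v in values]


def delta_decode(parent: int, deltas: List[int]) -> List[int]:
    """Recover the original values from the offsets."""
    return [parent + d for d in deltas]


paths = [b"/usr/lib/a", b"/usr/lib/b", b"/usr/local"]
prefix, rest = compress_page(paths)
print(prefix, rest)
print(delta_encode(1000000, [1000003, 1000010]))
```

Working on raw bytes sidesteps collation questions, which is the point about
bytea above; whether a text prefix is safe to share under a given locale is a
separate matter this sketch does not address.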

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
