Re: Index Tuple Compression Approach?

From: Chris Browne <cbbrowne(at)acm(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Index Tuple Compression Approach?
Date: 2007-08-16 14:50:36
Message-ID: 608x8bpm83.fsf@dba2.int.libertyrms.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

stark(at)enterprisedb(dot)com (Gregory Stark) writes:

> "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
>
>> Gregory Stark wrote:
>>> "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
>>>
>>>> That general approach of storing a common part leading part just once is
>>>> called prefix compression. Yeah, it helps a lot on long text fields.
>>>> Tree structures like file paths in particular.
>>>
>>> You kind of want to do avoid both the prefix and the suffix, no?
>>
>> You're much more likely to find common prefixes than suffixes in an
>> index page, because of the ordering. I suppose compressing the suffix
>> would be useful in some cases as well. You might be better off with some
>> generic compression algorithm at that point, though.
>
> Sorry, by "suffix" I don't mean common sufixes, I mean the bits of the key
> following the point which discriminates between the left and right side of the
> tree.
>
> So for example if you're indexing a text field and have a
> tree structure like:
>
> Redhat Fedora Core 7
> / \
> Debian Etch (Unstable) Ubuntu hoary
>
> We don't really need the whole of "Redhat Fedora Core 7" in the index node. We
> could actually get by with just "R". Everything before "R" is on the left and
> everything after "R" is on the right.

Right. The case where you get more characters than just "R" is when
you introduce extra entries that have the same prefix. Say, "Redhat
Fedora Core 4", "Redhat Fedora Core 5", "Redhat Fedora Core 6". And,
for good measure, let's throw in "Redhat RHAS 3", "Redhat RHAS 4", and
"Redhat RHAS 5".

In that case, you'd have substrings:
"R"
"edhat "
"Fedora Core "
"RHAS "
as discriminators.
--
"cbbrowne","@","cbbrowne.com"
http://www3.sympatico.ca/cbbrowne/spreadsheets.html
If a mute swears, does his mother wash his hands with soap?

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Darcy Buskermolen 2007-08-16 15:24:14 Re: build farm failures
Previous Message Florian G. Pflug 2007-08-16 14:13:55 Re: XID wraparound and busy databases