Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Date: 2019-11-15 13:16:22
Message-ID: CA+TgmoZ3_SJ8+RTL+6KV1V+RRucaW7UXa4Z4YFmVT4xQkhjhRw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Nov 13, 2019 at 2:51 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> "Deduplication" never means that you get rid of duplicates. According
> to Wikipedia's deduplication article: "Whereas compression algorithms
> identify redundant data inside individual files and encodes this
> redundant data more efficiently, the intent of deduplication is to
> inspect large volumes of data and identify large sections – such as
> entire files or large sections of files – that are identical, and
> replace them with a shared copy".

Hmm. Well, maybe I'm just behind the times. But that same Wikipedia
article also says that deduplication works on large chunks "such as
entire files or large sections of files," thus differentiating it from
compression algorithms, which work at the byte level. So it seems to me
that what you are doing still sounds more like ad-hoc compression.

> Can you suggest an alternative?

My instinct is to pick a name that somehow involves compression and
just put enough other words in there to make it clear, e.g. "duplicate
value compression," or something of that sort.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
