Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Юрий Соколов <funny(dot)falcon(at)gmail(dot)com>
Subject: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Date: 2020-02-20 01:08:10
Message-ID: CAH2-WzmxczknR1eQcT+PGAYkD-wsTD08JcDr+fnUZ2VeyyW6LQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 19, 2020 at 11:16 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> On Wed, Feb 19, 2020 at 8:14 AM Anastasia Lubennikova
> <a(dot)lubennikova(at)postgrespro(dot)ru> wrote:
> > Thank you for this work. I've looked through the patches and they seem
> > to be ready for commit.
> > I haven't yet read recent documentation and readme changes, so maybe
> > I'll send some more feedback tomorrow.
>
> Great.

I should add: I plan to commit the patch within the next 7 days.

I believe that the design of deduplication itself is solid; it has
many more strengths than weaknesses. It works in a way that
complements the existing approach to page splits. The optimization can
easily be turned off (and easily turned back on again).
contrib/amcheck can detect almost any possible form of corruption that
could affect a B-Tree index that has posting list tuples. I have spent
months microbenchmarking every little aspect of this patch in
isolation. I've also spent a lot of time on conventional benchmarking.

It seems quite possible that somebody won't like some aspect of the
user interface. I am more than willing to work with other contributors
on any issue in that area that comes to light. I don't see any point
in waiting for other hackers to speak up before the patch is
committed, though. Anastasia posted the first version of this patch in
August of 2015, and there have been over 30 revisions of it since the
project was revived in 2019. Everyone has been given ample opportunity
to offer input.

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Soumyadeep Chakraborty 2020-02-20 01:17:57 Re: WIP: expression evaluation improvements
Previous Message David Fetter 2020-02-19 23:41:58 Re: Parallel copy