Re: PG 13 release notes, first draft

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PG 13 release notes, first draft
Date: 2020-05-11 23:10:00
Message-ID: 20200511231000.GC4666@momjian.us
Lists: pgsql-hackers

On Thu, May 7, 2020 at 11:54:12AM -0700, Peter Geoghegan wrote:
> Hi Bruce,
>
> On Mon, May 4, 2020 at 8:16 PM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > I have committed the first draft of the PG 13 release notes. You can
> > see them here:
> >
> > https://momjian.us/pgsql_docs/release-13.html
>
> I see that you have an entry for the deduplication feature:
>
> "More efficiently store duplicates in btree indexes (Anastasia
> Lubennikova, Peter Geoghegan)"
>
> I would like to provide some input on this. Fortunately it's much
> easier to explain than the B-Tree work that went into Postgres 12. I
-----------------

Well, that's good! :-)

> think that you should point out that deduplication works by storing
> the duplicates in the obvious way: Only storing the key once per
> distinct value (or once per distinct combination of values in the case
> of multi-column indexes), followed by an array of TIDs (i.e. a posting
> list). Each TID points to a separate row in the table.

These are not details that should be in the release notes since the
internal representation is not important for its use.
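
For readers who want to picture the layout Peter describes, here is a minimal
Python sketch. It is only an illustration of "one key, followed by an array
of TIDs"; it is not PostgreSQL's on-disk format, and the function name and
the (block, offset) tuples standing in for TIDs are assumptions made for the
example.

```python
from collections import defaultdict


def deduplicate(index_entries):
    """Group (key, tid) pairs so each distinct key is kept once.

    key: a tuple of column values (one element per indexed column).
    tid: a hypothetical (block, offset) pair pointing at a table row.
    Returns a list of (key, posting_list) sorted by key, where the
    posting list is the sorted list of TIDs for that key.
    """
    posting = defaultdict(list)
    for key, tid in index_entries:
        posting[key].append(tid)
    return [(key, sorted(tids)) for key, tids in sorted(posting.items())]


# Three index entries, two of which share the same key.
entries = [
    (("smith",), (0, 1)),
    (("jones",), (0, 2)),
    (("smith",), (1, 5)),
]
deduplicated = deduplicate(entries)
```

Here the key ("smith",) is stored once with two TIDs instead of appearing in
two separate entries, which is where the space saving comes from.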

> It won't be uncommon for this to make indexes as much as 3x smaller
> (it depends on a number of different factors that you can probably
> guess). I wrote a summary of how it works for power users in the
> B-Tree documentation chapter, which you might want to link to in the
> release notes:
>
> https://www.postgresql.org/docs/devel/btree-implementation.html#BTREE-DEDUPLICATION
>
> Users that pg_upgrade will have to REINDEX to actually use the
> feature, regardless of which version they've upgraded from. There are
> also some limited caveats about the data types that can use
> deduplication, and stuff like that -- see the documentation section I
> linked to.

I have added text to this about pg_upgrade:

Users upgrading with pg_upgrade will need to use REINDEX to make
use of this feature.

> Finally, you might want to note that the feature is enabled by
> default, and can be disabled by setting the "deduplicate_items" index
> storage option to "off". (We have yet to make a final decision on
> whether the feature should be enabled before the first stable release
> of Postgres 13, though -- I have an open item for that.)

Well, again, I don't think the average user needs to know this can be
disabled. They can look at the docs of this feature to see that.

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EnterpriseDB https://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
