Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Date: 2019-11-19 01:26:37
Message-ID: CAH2-WzmSMmU2eNvY9+a4MNP+z02h6sa-uxZvN3un6jY02ZVBSw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Nov 15, 2019 at 5:02 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> What I saw suggests that we will need to remove the new "postingoff"
> field from xl_btree_insert. (We can create a new XLog record for leaf
> page inserts that also need to split a posting list, without changing
> much else.)

Attached is v24. This revision doesn't fix the problem with
xl_btree_insert record bloat, but it does fix the bitrot against the
master branch that was caused by commit 50d22de9. (This patch has had
a surprisingly large number of conflicts against the master branch
recently.)

Other changes:

* The pageinspect patch has been cleaned up. I now propose that it be
committed alongside the main patch.

The big change here is that bt_page_items() now represents posting
lists as an array of TIDs, much like gin_leafpage_items(). I also
added documentation that goes into the ways in which ctid can be used
to encode information (arguably some of this should have been included
with the Postgres 12 B-Tree work).

* Added basic tests that cover deduplication within unique indexes. We ought
to have code coverage of the case where _bt_check_unique() has to step
right (actually, we don't have that on the master branch either).
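
For readers who have not followed the thread, the posting list idea
mentioned above can be illustrated with a toy model. This is only a
sketch under stated assumptions: the deduplicate() function, the Tid
alias, and the sample data are hypothetical names made up for
illustration, and none of this reflects the patch's actual on-disk
format or C code. It simply shows duplicate keys being collapsed into
one entry per key that carries an array of TIDs, which is roughly the
shape of output described for bt_page_items().

```python
from itertools import groupby
from typing import List, Tuple

# Hypothetical stand-in for a heap TID: (block number, offset number).
Tid = Tuple[int, int]


def deduplicate(items: List[Tuple[str, Tid]]) -> List[Tuple[str, List[Tid]]]:
    """Collapse (key, tid) leaf entries into (key, [tids]) posting lists.

    Toy model only: real nbtree tuples are binary structures, not
    Python tuples, and the patch has size limits this ignores.
    """
    ordered = sorted(items)  # sort by key, then by TID
    return [
        (key, [tid for _, tid in group])
        for key, group in groupby(ordered, key=lambda item: item[0])
    ]


# Four leaf entries, three of which share the same key.
leaf = [
    ("apple", (0, 1)),
    ("apple", (0, 7)),
    ("apple", (3, 2)),
    ("pear", (1, 4)),
]

posting = deduplicate(leaf)
for key, tids in posting:
    print(key, tids)
```

The key is stored once per group rather than once per duplicate, which
is where the space saving in this toy model comes from.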

--
Peter Geoghegan

Attachment                                                   Content-Type              Size
v24-0002-Teach-pageinspect-about-nbtree-posting-lists.patch  application/octet-stream  16.5 KB
v24-0001-Add-deduplication-to-nbtree.patch                   application/octet-stream  179.8 KB
