Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Date: 2019-12-13 02:21:20
Message-ID: CAH2-WzmqHsBW6t3TwDRygbzgMJFtukNwNcX9B+w_OrYXbftaPQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 3, 2019 at 12:13 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> The new criteria/heuristic for unique indexes is very simple: If a
> unique index has an existing item that is a duplicate on the incoming
> item at the point that we might have to split the page, then apply
> deduplication. Otherwise (when the incoming item has no duplicates),
> don't apply deduplication at all -- just accept that we'll have to
> split the page. We already cache the bounds of our initial binary
> search in insert state, so we can reuse that information within
> _bt_findinsertloc() when considering deduplication in unique indexes.

Attached is v26, which adds this new criteria/heuristic for unique
indexes. We now seem to consistently get good results with unique
indexes.

Other changes:

* A commit message is now included for the main patch/commit.

* The btree_deduplication GUC is now a boolean, since it is no longer
up to the user to indicate when deduplication is appropriate in unique
indexes (the new heuristic does that instead). The GUC now only
affects non-unique indexes.

* Simplified the user docs. They now only mention deduplication of
unique indexes in passing, in line with the general idea that
deduplication in unique indexes is an internal optimization.

* Fixed bug that made backwards scans that touch posting lists fail to
set LP_DEAD bits when that was possible (i.e. the kill_prior_tuple
optimization wasn't always applied there with posting lists, for no
good reason). Also documented the assumptions made by the new code in
_bt_readpage()/_bt_killitems() -- if that was clearer in the first
place, then the LP_DEAD/kill_prior_tuple bug might never have
happened.

* Fixed some memory leaks in nbtree VACUUM.

Still waiting for some review of the first patch, to get it out of the
way. Anastasia?

--
Peter Geoghegan

Attachment Content-Type Size
v26-0001-Remove-dead-pin-scan-code-from-nbtree-VACUUM.patch application/octet-stream 17.3 KB
v26-0003-Teach-pageinspect-about-nbtree-posting-lists.patch application/octet-stream 16.5 KB
v26-0004-DEBUG-Show-index-values-in-pageinspect.patch application/octet-stream 4.4 KB
v26-0002-Add-deduplication-to-nbtree.patch application/octet-stream 196.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2019-12-13 02:23:27 Re: Wrong assert in TransactionGroupUpdateXidStatus
Previous Message Justin Pryzby 2019-12-13 02:13:22 Re: shared tempfile was not removed on statement_timeout (unreproducible)