[WIP] [B-Tree] Retail IndexTuple deletion

From: "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>
To: pgsql-hackers(at)postgresql(dot)org
Subject: [WIP] [B-Tree] Retail IndexTuple deletion
Date: 2018-06-18 04:39:23
Message-ID: 425db134-8bba-005c-b59d-56e50de3b41e@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Hi,
I have written a code for quick indextuple deletion from an relation by
heap tuple TID. The code relate to "Retail IndexTuple deletion"
enhancement of btree index on postgresql wiki [1].
Briefly, it includes three steps:
1. Key generation for index tuple searching.
2. Index relation search for tuple with the heap tuple TID.
3. Deletion of the tuple from the index relation.

Now, index relation cleaning performs by vacuum which scan all index
relation for dead entries sequentially, tuple-by-tuple. This simplistic
and safe method can be significantly surpasses in the cases, than a
number of dead entries is not large by retail deletions which uses index
scans for search dead entries. Also, it can be used by distributed
systems for reduce cost of a global index vacuum.

Patch '0001-retail-indextuple-deletion' introduce new function
amtargetdelete() in access method interface. Patch
'0002-quick-vacuum-strategy' implements this function for an alternative
strategy of lazy index vacuum, called 'Quick Vacuum'.

The code demands hold DEAD tuple storage until scan key will be created.
In this version I add 'target_index_deletion_factor' option. If it more
than 0, heap_page_prune() uses ItemIdMarkDead() instead of
ItemIdSetDead() function for set DEAD flag and hold the tuple storage.
Next step is developing background worker which will collect pairs (tid,
scankey) of DEAD tuples from heap_page_prune() function.

Here are test description and some execution time measurements results
showing the benefit of this patches:

Test:
-----
create table test_range(id serial primary key, value integer);
insert into test_range (value) select random()*1e7/10^N from
generate_series(1, 1e7);
DELETE FROM test_range WHERE value=1;
VACUUM test_range;

Results:
--------

| n | t1, s | t2, s | speedup |
|---|---------------------------|
| 0 | 0.00003| 0.4476 | 1748.4 |
| 1 | 0.00006| 0.5367 | 855.99 |
| 2 | 0.0004 | 0.9804 | 233.99 |
| 3 | 0.0048 | 1.6493 | 34.576 |
| 4 | 0.5600 | 2.4854 | 4.4382 |
| 5 | 3.3300 | 3.9555 | 1.2012 |
| 6 | 17.700 | 5.6000 | 0.3164 |
|---|---------------------------|
in the table, t1 - measured execution time of lazy_vacuum_index()
function by Quick-Vacuum strategy; t2 - measured execution time of
lazy_vacuum_index() function by Lazy-Vacuum strategy;

Note, guaranteed allowable time of index scans (used for quick deletion)
will be achieved by storing equal-key index tuples in physical TID order
[2] with patch [3].

[1]
https://wiki.postgresql.org/wiki/Key_normalization#Retail_IndexTuple_deletion
[2]
https://wiki.postgresql.org/wiki/Key_normalization#Making_all_items_in_the_index_unique_by_treating_heap_TID_as_an_implicit_last_attribute
[3]
https://www.postgresql.org/message-id/CAH2-WzkVb0Kom%3DR%2B88fDFb%3DJSxZMFvbHVC6Mn9LJ2n%3DX%3DkS-Uw%40mail.gmail.com

--
Andrey Lepikhov
Postgres Professional:
https://postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0001-retail-indextuple-deletion.patch text/x-patch 12.0 KB
0002-quick-vacuum-strategy.patch text/x-patch 7.4 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-06-18 04:42:03 Re: Fix some error handling for read() and errno
Previous Message Michael Paquier 2018-06-18 04:29:30 Re: [bug fix] Cascaded standby cannot start after a clean shutdown