Re: Deleting older versions in unique indexes to avoid page splits

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Victor Yegorov <vyegorov(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Deleting older versions in unique indexes to avoid page splits
Date: 2020-10-16 20:58:01
Message-ID: CAH2-WzmzGO7j2wCesRvyqZNL4XTc8896W=W0wX8bX1vX3XGPag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Oct 16, 2020 at 1:00 PM Victor Yegorov <vyegorov(at)gmail(dot)com> wrote:
> I really like these results, great work!

Thanks Victor!

> I'm also wondering how IO numbers changed due to these improvements, shouldn't be difficult to look into.

Here is the pg_statio_user_indexes for patch for the same run:

schemaname | relname | indexrelname |
idx_blks_read | idx_blks_hit
------------+------------------+---------------------------+---------------+---------------
public | pgbench_accounts | aid_pkey_include_abalance |
12,828,736 | 534,819,826
public | pgbench_accounts | one |
12,750,275 | 534,486,742
public | pgbench_accounts | two |
2,474,893 | 2,216,047,568
(3 rows)

And for master:

schemaname | relname | indexrelname |
idx_blks_read | idx_blks_hit
------------+------------------+---------------------------+---------------+---------------
public | pgbench_accounts | aid_pkey_include_abalance |
29,526,568 | 292,705,432
public | pgbench_accounts | one |
28,239,187 | 293,164,160
public | pgbench_accounts | two |
6,505,615 | 1,318,164,692
(3 rows)

Here is pg_statio_user_tables patch:

schemaname | relname | heap_blks_read | heap_blks_hit |
idx_blks_read | idx_blks_hit | toast_blks_read | toast_blks_hit |
tidx_blks_read | tidx_blks_hit
------------+------------------+----------------+---------------+---------------+---------------+-----------------+----------------+----------------+---------------
public | pgbench_accounts | 123,195,496 | 696,805,485 |
28,053,904 | 3,285,354,136 | | |
|
public | pgbench_branches | 11 | 1,553 |
| | | |
|
public | pgbench_history | 0 | 0 |
| | | |
|
public | pgbench_tellers | 86 | 15,416 |
| | | |
|
(4 rows)

And the pg_statio_user_tables for master:

schemaname | relname | heap_blks_read | heap_blks_hit |
idx_blks_read | idx_blks_hit | toast_blks_read | toast_blks_hit |
tidx_blks_read | tidx_blks_hit
------------+------------------+----------------+---------------+---------------+---------------+-----------------+----------------+----------------+---------------
public | pgbench_accounts | 106,502,089 | 334,875,058 |
64,271,370 | 1,904,034,284 | | |
|
public | pgbench_branches | 11 | 1,553 |
| | | |
|
public | pgbench_history | 0 | 0 |
| | | |
|
public | pgbench_tellers | 86 | 15,416 |
| | | |
|
(4 rows)

Of course, it isn't fair to make a direct comparison because we're
doing ~1.7x times more work with the patch. But even still, the
idx_blks_read is less than half with the patch.

BTW, the extra heap_blks_hit from the patch are not only due to the
fact that the system does more directly useful work. It's also due to
the extra garbage collection triggered in indexes. The same is
probably *not* true with heap_blks_read, though. I minimize the number
of heap pages accessed by the new cleamup mechanism each time, and
temporal locality will help a lot. I think that we delete index
entries pointing to garbage in the heap at pretty predictable
intervals. Heap pages full of LP_DEAD line pointer garbage only get
processed with a few times close together in time, after which they're
bound to either get VACUUM'd or get evicted from shared buffers.

> Peter, according to cfbot patch no longer compiles.
> Can you send and update, please?

Attached is v3, which is rebased against the master branch as of
today. No real changes, though.

--
Peter Geoghegan

Attachment Content-Type Size
v3-0001-Add-delete-deduplication-to-nbtree.patch application/octet-stream 57.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-10-16 21:04:56 Re: upcoming API changes for LLVM 12
Previous Message Tom Lane 2020-10-16 20:56:47 Re: Internal key management system