Re: Deleting older versions in unique indexes to avoid page splits

From: Victor Yegorov <vyegorov(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Deleting older versions in unique indexes to avoid page splits
Date: 2021-01-04 16:28:11
Message-ID: CAGnEbohqBDNX5kncK+X8BaOfLcFKMqUHiSNUOD0ofxCQzsOjdg@mail.gmail.com
Lists: pgsql-hackers

On Thu, 31 Dec 2020 at 03:55, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:

> Attached is v12, which fixed bitrot against the master branch. This
> version has significant comment and documentation revisions. It is
> functionally equivalent to v11, though.
>
> I intend to commit the patch in the next couple of weeks. While it
> certainly would be nice to get a more thorough review, I don't feel
> that it is strictly necessary. The patch provides very significant
> benefits with certain workloads that have traditionally been
> considered an Achilles' heel for Postgres. Even zheap doesn't provide
> a solution to these problems. The only thing that I can think of that
> might reasonably be considered in competition with this design is
> WARM, which hasn't been under active development since 2017 (I assume
> that it has been abandoned by those involved).
>

I've looked through the v12 patch.

I like the new outline:

- _bt_delete_or_dedup_one_page() is the main entry point for the new code
- first we try _bt_simpledel_pass(), which does improved cleanup of LP_DEAD entries
- then (if necessary) _bt_bottomupdel_pass() for bottom-up deletion
- finally, we perform _bt_dedup_pass() for deduplication

We split the leaf page only if all the actions above failed to provide
enough space.
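
To make sure I read the outline correctly, here is a toy sketch of that
ordering. It is not the patch's code: ToyPage, ToyOutcome and all toy_*
names are made up for illustration, the page is reduced to a few byte
counters, and the real conditions under which each pass is attempted are
left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model of a leaf page: byte counters only. */
typedef struct ToyPage
{
    size_t free_space;        /* bytes currently free */
    size_t lp_dead_bytes;     /* held by tuples already marked LP_DEAD */
    size_t old_version_bytes; /* held by old versions that are safe to delete */
    size_t duplicate_bytes;   /* bytes that merging duplicates would save */
} ToyPage;

/* Which step finally produced enough free space. */
typedef enum ToyOutcome
{
    TOY_SIMPLE_DELETION,
    TOY_BOTTOMUP_DELETION,
    TOY_DEDUPLICATION,
    TOY_PAGE_SPLIT
} ToyOutcome;

/* Stand-in for the simple deletion pass: reclaim LP_DEAD space. */
static void
toy_simpledel_pass(ToyPage *page)
{
    page->free_space += page->lp_dead_bytes;
    page->lp_dead_bytes = 0;
}

/* Stand-in for the bottom-up deletion pass: reclaim old versions. */
static void
toy_bottomupdel_pass(ToyPage *page)
{
    page->free_space += page->old_version_bytes;
    page->old_version_bytes = 0;
}

/* Stand-in for the deduplication pass: merge duplicates. */
static void
toy_dedup_pass(ToyPage *page)
{
    page->free_space += page->duplicate_bytes;
    page->duplicate_bytes = 0;
}

/*
 * Try each pass in the order given in the outline and stop as soon as
 * "needed" bytes are free.  The caller splits the page only when every
 * pass has failed to free enough space.
 */
ToyOutcome
toy_delete_or_dedup_one_page(ToyPage *page, size_t needed)
{
    assert(page != NULL);

    toy_simpledel_pass(page);
    if (page->free_space >= needed)
        return TOY_SIMPLE_DELETION;

    toy_bottomupdel_pass(page);
    if (page->free_space >= needed)
        return TOY_BOTTOMUP_DELETION;

    toy_dedup_pass(page);
    if (page->free_space >= needed)
        return TOY_DEDUPLICATION;

    return TOY_PAGE_SPLIT;
}
```

The point of the ordering, as I understand it, is that the cheaper
passes run first and each later pass is only reached when the earlier
ones did not free enough space.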

Some comments on the code.

v12-0001
--------

1. For the following comment

+ * Only do this for key columns. A change to a non-key column within an
+ * INCLUDE index should not be considered because that's just payload to
+ * the index (they're not unlike table TIDs to the index AM).

The last part of it (in parentheses) is difficult to grasp due to
the double negation ("not unlike"). I think it's better to rephrase it.

2. After reading the patch, I also think that the fact that
index_unchanged_by_update() and index_unchanged_by_update_var_walker()
return opposite bool states (i.e. when the latter returns true, the
former returns false) is a bit misleading.

Although the logic as it is looks fine, maybe
index_unchanged_by_update_var_walker() can be renamed to avoid this
confusion, to something like index_expression_changed_walker()?
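
A toy sketch of what I mean by the inverted return values. This is not
the patch's code: ToyExpr and the toy_* functions are made-up stand-ins,
and the expression tree is reduced to nodes that optionally reference a
column number. With the walker named after what it detects, the
negation in the outer function reads naturally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical expression node: references column attno, or none if 0. */
typedef struct ToyExpr
{
    int attno;
    const struct ToyExpr *left;
    const struct ToyExpr *right;
} ToyExpr;

/*
 * Walker: returns true as soon as it finds a node that references a
 * column modified by the UPDATE.  changed[i] says whether column i + 1
 * was modified.
 */
bool
toy_expression_changed_walker(const ToyExpr *node,
                              const bool *changed, size_t ncols)
{
    if (node == NULL)
        return false;

    if (node->attno > 0 && (size_t) node->attno <= ncols &&
        changed[node->attno - 1])
        return true;

    return toy_expression_changed_walker(node->left, changed, ncols) ||
           toy_expression_changed_walker(node->right, changed, ncols);
}

/*
 * Outer function: the index is unchanged exactly when the walker finds
 * no changed column, so the result is the walker's result negated.
 */
bool
toy_index_unchanged_by_update(const ToyExpr *index_expr,
                              const bool *changed, size_t ncols)
{
    return !toy_expression_changed_walker(index_expr, changed, ncols);
}
```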

v12-0002
--------

1. Thanks for the comments, they're well made and do help to read the code.

2. I'm not sure the bottomup_delete_items parameter is very helpful. In
order to decide whether to disable bottom-up deletion, a DBA needs to
somehow measure its impact on a particular index. Currently I do not see
how to achieve this. Not sure if this is overly important, though, as
you have a similar parameter for deduplication.

3. It feels like it would be better to turn indexUnchanged into
indexChanged and negate its usage in the code, as !indexChanged reads
more naturally than !indexUnchanged, at least to me.

This is all I have. I agree that this code is pretty close to being
committed.

Now for the tests.

First, I ran a 2-hour-long case with the same setup as I used in my
e-mail of 15 November. I found no difference between patch and master
whatsoever, which makes me think that current master is quite good at
keeping bloat under control (not sure if this is an effect of commit
4228817449 or of deduplication).

I created another setup (see attached testcases). Basically, I emulated
queue operations (INSERT at the end and DELETE

--
Victor Yegorov
