Re: Deleting older versions in unique indexes to avoid page splits

From: Victor Yegorov <vyegorov(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Deleting older versions in unique indexes to avoid page splits
Date: 2020-11-15 22:29:08
Message-ID: CAGnEbohYF_K6b0v=2uc289=v67qNhc3n01Ftic8X94zP7kKqtw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

пт, 13 нояб. 2020 г. в 00:01, Peter Geoghegan <pg(at)bowt(dot)ie>:

> On Wed, Nov 11, 2020 at 12:58 PM Victor Yegorov <vyegorov(at)gmail(dot)com>
> wrote:
> > On the other hand, there's quite a big drop on the UPDATEs throughput.
> For sure, undersized shared_bufefrs
> > contribute to this drop. Still, my experience tells me that under
> conditions at hand (disabled HOT due to index
> > over update time column) tables will tend to accumulate bloat and
> produce unnecessary IO also from WAL.
>
> I think that the big SELECT statement with an "ORDER BY mtime ... "
> was a good way of demonstrating the advantages of the patch.
>
> Attached is v8, which has the enhancements for low cardinality data
> that I mentioned earlier today. It also simplifies the logic for
> dealing with posting lists that we need to delete some TIDs from.
> These posting list simplifications also make the code a bit more
> efficient, which might be noticeable during benchmarking.
>
> Perhaps your "we have 5,2% slowdown in UPDATE speed" issue will be at
> least somewhat fixed by the enhancements to v8?
>

Yes, v8 looks very nice!

I've done two 8 hour long sessions with scale=2000 and shared_buffers=512MB
(previously sent postgresql.auto.conf used here with no changes).
The rest of the setup is the same:
- mtime column that is tracks update time
- index on (mtime, aid)
- tenner low cardinality index from Peter's earlier e-mail
- 3 pgbench scripts run in parallel on master and on v8 patchset (scripts
from the previous e-mail used here).

Master
------
relname | nrows | blk_before | mb_before | blk_after |
mb_after | diff
-----------------------+-----------+------------+-----------+-----------+----------+--------
pgbench_accounts | 300000000 | 4918033 | 38422.1 | 5066589 |
39582.7 | +3.0%
accounts_mtime | 300000000 | 1155119 | 9024.4 | 1422354 |
11112.1 | +23.1%
pgbench_accounts_pkey | 300000000 | 822573 | 6426.4 | 822573 |
6426.4 | 0
tenner | 300000000 | 346050 | 2703.5 | 563101 |
4399.2 | +62.7%
(4 rows)

DB size: 59.3..64.5 (+5.2GB / +8.8%)

Patched
-------
relname | nrows | blk_before | mb_before | blk_after |
mb_after | diff
-----------------------+-----------+------------+-----------+-----------+----------+--------
pgbench_accounts | 300000000 | 4918033 | 38422.1 | 5068092 |
39594.5 | +3.0%
accounts_mtime | 300000000 | 1155119 | 9024.4 | 1428972 |
11163.8 | +23.7%
pgbench_accounts_pkey | 300000000 | 822573 | 6426.4 | 822573 |
6426.4 | 0
tenner | 300000000 | 346050 | 2703.5 | 346050 |
2703.5 | 0
(4 rows)

DB size: 59.3..62.8 (+3.5GB / +5.9%)

TPS
---
query | Master TPS | Patched TPS | diff
----------------+------------+-------------+-------
UPDATE + SELECT | 2413 | 2473 | +2.5%
3 SELECT in txn | 19737 | 19545 | -0.9%
15min SELECT | 0.74 | 1.03 | +39%

Based on the figures and also on the graphs attached, I can tell v8 has no
visible regression
in terms of TPS, IO pattern changes slightly, but the end result is worth
it.
In my view, this patch can be applied from a performance POV.

I wanted to share these before I'll finish with the code review, I'm
planning to send it tomorrow.

--
Victor Yegorov

Attachment Content-Type Size
20201114-results-master.txt text/plain 1.9 KB
20201114-q1-UPDATE-master.png image/png 408.5 KB
20201114-q2-SELECT-master.png image/png 340.5 KB
20201114-q3-15min-SELECT-master.png image/png 345.5 KB
20201114-overview-master.png image/png 526.2 KB
20201114-q1-UPDATE-v8.png image/png 381.6 KB
20201114-q2-SELECT-v8.png image/png 312.3 KB
20201114-q3-15min-SELECT-v8.png image/png 344.9 KB
20201114-overview-v8.png image/png 519.6 KB
20201114-results-patched.txt text/plain 1.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Janes 2020-11-15 23:13:15 Re: Supporting = operator in gin/gist_trgm_ops
Previous Message Jürgen Purtz 2020-11-15 18:45:35 Re: Additional Chapter for Tutorial