Re: [PATCH] Speedup truncates of relation forks

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: "Jamison, Kirk" <k(dot)jamison(at)jp(dot)fujitsu(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Speedup truncates of relation forks
Date: 2019-06-11 23:09:00
Message-ID: 20190611230900.ctj4sz5zflzdxkqr@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jun 11, 2019 at 07:34:35AM +0000, Jamison, Kirk wrote:
>Hi all,
>Attached is a patch to speed up the performance of truncates of relations.
>This is also my first time to contribute my own patch,
>and I'd gladly appreciate your feedback and advice.

Thanks for the patch. Please add it to the commitfest app, so that we
don't forget about it:

>A. Summary
>Whenever we truncate relations, it scans the shared buffers thrice
>(one per fork) which can be time-consuming. This patch improves
>the performance of relation truncates by initially marking the
>pages-to-be-truncated of relation forks, then simultaneously
>truncating them, resulting to an improved performance in VACUUM,
>autovacuum operations and their recovery performance.

OK, so essentially the whole point is to scan the buffers only once, for
all forks at the same time (instead of three times).

>B. Patch Details
>The following functions were modified:
>1. FreeSpaceMapTruncateRel() and visibilitymap_truncate()
>a. CURRENT HEAD: These functions truncate the FSM pages and unused VM pages.
>b. PATCH: Both functions only mark the pages to truncate and return a block number.
>- We used to call smgrtruncate() in these functions, but these are now moved inside the RelationTruncate() and smgr_redo().
>- The tentative renaming of the functions are: MarkFreeSpaceMapTruncateRel() and visibilitymap_mark_truncate(). Feel free to suggest better names.
>2. RelationTruncate()
>a. HEAD: Truncate FSM and VM first, then write WAL, and lastly truncate main fork.
>b. PATCH: Now we mark FSM and VM pages first, write WAL, mark MAIN fork pages, then truncate all forks (MAIN, FSM, VM) simultaneously.
>3. smgr_redo()
>a. HEAD: Truncate main fork and the relation during XLOG replay, create fake rel cache for FSM and VM, truncate FSM, truncate VM, then free fake rel cache.
>b. PATCH: Mark main fork dirty buffers, create fake rel cache, mark fsm and vm buffers, truncate marked pages of relation forks simultaneously, truncate relation during XLOG replay, then free fake rel cache.
>4. smgrtruncate(), DropRelFileNodeBuffers()
>- input arguments are changed to array of forknum and block numbers, int nforks (size of forkNum array)
>- truncates the pages of relation forks simultaneously
>5. smgrdounlinkfork()
>I modified the function because it calls DropRelFileNodeBuffers. However, this is a dead code that can be removed.
>I did not remove it for now because that's not for me but the community to decide.

You really don't need to extract the changes like this - such changes
are generally obvious from the diff.

You only need to explain things that are not obvious from the code
itself, e.g. non-trivial design decisions, etc.

>C. Performance Test
>I setup a synchronous streaming replication between a master-standby.
>In postgresql.conf:
>autovacuum = off
>wal_level = replica
>max_wal_senders = 5
>wal_keep_segments = 16
>max_locks_per_transaction = 10000
>#shared_buffers = 8GB
>#shared_buffers = 24GB
>Objective: Measure VACUUM execution time; varying shared_buffers size.
>1. Create table (ex. 10,000 tables). Insert data to tables.
>2. DELETE FROM TABLE (ex. all rows of 10,000 tables)
>3. psql -c "\timing on" (measures total execution of SQL queries)
>4. VACUUM (whole db)
>If you want to test with large number of relations,
>you may use the stored functions I used here:
>D. Results
>HEAD results
>1) 128MB shared_buffers = 48.885 seconds
>2) 8GB shared_buffers = 5 min 30.695 s
>3) 24GB shared_buffers = 14 min 13.598 s
>PATCH results
>1) 128MB shared_buffers = 42.736 s
>2) 8GB shared_buffers = 2 min 26.464 s
>3) 24GB shared_buffers = 5 min 35.848 s
>The performance significantly improved compared to HEAD,
>especially for large shared buffers.

Right, that seems nice. And it matches the expected 1:3 speedup, at
least for the larger shared_buffers cases.

Years ago I've implemented an optimization for many DROP TABLE commands
in a single transaction - instead of scanning buffers for each relation,
the code now accumulates a small number of relations into an array, and
then does a bsearch for each buffer.

Would something like that be applicable/useful here? That is, if we do
multiple TRUNCATE commands in a single transaction, can we optimize it
like this?


Tomas Vondra
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2019-06-11 23:13:43 Re: [PATCH] Speedup truncates of relation forks
Previous Message Daniel Gustafsson 2019-06-11 22:22:56 Re: tableam: abstracting relation sizing code