suboverflowed subtransactions concurrency performance optimize

From: "Pengchengliu" <pengchengliu(at)tju(dot)edu(dot)cn>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: suboverflowed subtransactions concurrency performance optimize
Date: 2021-08-30 08:43:24
Message-ID: 003201d79d7b$189141f0$49b3c5d0$
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi hackers,
I wrote a patch to resolve the subtransactions concurrency performance
problems when suboverflowed.

When we use more than PGPROC_MAX_CACHED_SUBXIDS(64) subtransactions per
transaction concurrency, it will lead to subtransactions performance
All backend will be stuck at acquiring lock SubtransSLRULock.

The reproduce steps in PG master branch:

1, init a cluster, append postgresql.conf as below:

max_connections = '2500'
max_files_per_process = '2000'
max_locks_per_transaction = '64'
max_parallel_maintenance_workers = '8'
max_parallel_workers = '60'
max_parallel_workers_per_gather = '6'
max_prepared_transactions = '15000'
max_replication_slots = '10'
max_wal_senders = '64'
max_worker_processes = '250'
shared_buffers = 8GB

2, create table and insert some records as below:

id integer,
val integer NOT NULL
WITH (fillfactor='50');

INSERT INTO contend (id, val)
FROM generate_series(1, 10000) AS i;


3, The script subtrans_128.sql in attachment. use pgbench with
subtrans_128.sql as below.
pgbench -d postgres -p 33800 -n -r -f subtrans_128.sql -c 500 -j 500 -T

4, After for a while, we can get the stuck result. We can query
pg_stat_activity. All backends wait event is SubtransSLRULock.
We can use pert top and try find the root cause. The result of perf top
as below:
66.20% postgres [.] pg_atomic_compare_exchange_u32_impl
29.30% postgres [.] pg_atomic_fetch_sub_u32_impl
1.46% postgres [.] pg_atomic_read_u32
1.34% postgres [.] TransactionIdIsCurrentTransactionId
0.75% postgres [.] SimpleLruReadPage_ReadOnly
0.14% postgres [.] LWLockAttemptLock
0.14% postgres [.] LWLockAcquire
0.12% postgres [.] pg_atomic_compare_exchange_u32
0.09% postgres [.] HeapTupleSatisfiesMVCC
0.06% postgres [.] heapgetpage
0.03% postgres [.] sentinel_ok
0.03% postgres [.] XidInMVCCSnapshot
0.03% postgres [.] slot_deform_heap_tuple
0.03% postgres [.] ExecInterpExpr
0.02% postgres [.] AllocSetCheck
0.02% postgres [.] HeapTupleSatisfiesVisibility
0.02% postgres [.] LWLockRelease
0.02% postgres [.] TransactionIdPrecedes
0.02% postgres [.] SubTransGetParent
0.01% postgres [.] heapgettup_pagemode
0.01% postgres [.] CheckForSerializableConflictOutNeeded

After view the subtrans codes, it is easy to find that the global LWLock
SubtransSLRULock is the bottleneck of subtrans concurrently.

When a bakcend session assign more than PGPROC_MAX_CACHED_SUBXIDS(64)
subtransactions, we will get a snapshot with suboverflowed.
A suboverflowed snapshot does not contain all data required to determine
visibility, so PostgreSQL will occasionally have to resort to pg_subtrans.
These pages are cached in shared buffers, but you can see the overhead of
looking them up in the high rank of SimpleLruReadPage_ReadOnly in the perf

To resolve this performance problem, we think about a solution which cache
SubtransSLRU to local cache.
First we can query parent transaction id from SubtransSLRU, and copy the
SLRU page to local cache page.
After that if we need query parent transaction id again, we can query it
from local cache directly.
It will reduce the number of acquire and release LWLock SubtransSLRULock

From all snapshots, we can get the latest xmin. All transaction id which
precedes this xmin, it muse has been commited/abortd.
Their parent/top transaction has been written subtrans SLRU. Then we can
cache the subtrans SLRU and copy it to local cache.

Use the same produce steps above, with our patch we cannot get the stuck
Note that append our GUC parameter in postgresql.conf. This optimize is off
in default.

The patch is base on PG master branch

Our project in, Welcome to follow us,
AntDB, AsiaInfo's PG-based distributed database product


Attachment Content-Type Size
subtrans_128.sql application/octet-stream 14.9 KB
subtrans_local_optimize.patch text/plain 24.5 KB


Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Nancarrow 2021-08-30 08:44:12 Re: Added schema level support for publication.
Previous Message Andrey V. Lepikhov 2021-08-30 08:36:38 Re: Defer selection of asynchronous subplans until the executor initialization stage