old_snapshot_threshold bottleneck on replica

From: Maxim Orlov <orlovmg(at)gmail(dot)com>
To: Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: old_snapshot_threshold bottleneck on replica
Date: 2023-01-23 14:40:15
Message-ID: CACG=ezYV+EvO135fLRdVn-ZusfVsTY6cH1OZqWtezuEYH6ciQA@mail.gmail.com
Lists: pgsql-hackers

Hi!

One of our customers stumbled onto a significant performance degradation
while running multiple OLAP-like queries on a replica.
After some investigation, it became clear that the problem is in accessing
the old_snapshot_threshold parameter.

Access to the old_snapshot_threshold parameter is guarded by
mutex_threshold. This is not a problem on the primary server, since we
rarely call GetOldSnapshotThresholdTimestamp:

void
TestForOldSnapshot_impl(Snapshot snapshot, Relation relation)
{
    if (RelationAllowsEarlyPruning(relation)
        && (snapshot)->whenTaken < GetOldSnapshotThresholdTimestamp())
        ereport(ERROR,
                (errcode(ERRCODE_SNAPSHOT_TOO_OLD),
                 errmsg("snapshot too old")));
}

But in the case of a replica, we have to call
GetOldSnapshotThresholdTimestamp much more often, so this becomes a
bottleneck. The customer worked around the issue by setting
old_snapshot_threshold to 0, but I think we can do something about it.

Some more investigation:

-- On primary --
$ ./bin/psql postgres -c "create database benchmark"
CREATE DATABASE
$ ./bin/pgbench -i -Uorlov -s300 benchmark
dropping old tables...
NOTICE: table "pgbench_accounts" does not exist, skipping
...
creating tables...
generating data (client-side)...
30000000 of 30000000 tuples (100%) done (elapsed 142.37 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 177.67 s (drop tables 0.00 s, create tables 0.01 s, client-side
generate 144.45 s, vacuum 0.59 s, primary keys 32.61 s).

-- On secondary --
$ touch 1.sql
$ vim 1.sql
$ cat 1.sql
\set bid random(1, 300)
BEGIN;
SELECT sum(aid) FROM pgbench_accounts where bid = :bid GROUP BY bid;
END;
$ ./bin/pgbench -f 1.sql -p5433 -Uorlov -j10 -c100 -T720 -P1 -n benchmark
pgbench (16devel)
progress: 1.0 s, 0.0 tps, lat 0.000 ms stddev 0.000, 0 failed
...
progress: 20.0 s, 0.0 tps, lat 0.000 ms stddev 0.000, 0 failed

$ perf record -F 99 -a -g --call-graph=dwarf sleep 5
$ perf script --header --fields comm,pid,tid,time,event,ip,sym,dso > file
$ grep s_lock file | wc -l
3486

My proposal is to use atomics for threshold_timestamp and threshold_xid.
PFA the 0001 patch. With patch 0001 we got:

$ grep s_lock file2 | wc -l
8

Maybe we should go further and remove mutex_threshold here entirely? That
would allow threshold_timestamp and threshold_xid to be read
inconsistently with respect to each other, but is that really a problem?

Thoughts?

--
Best regards,
Maxim Orlov.

Attachment Content-Type Size
0001-PGPRO-7624-use-atomic-old_snapshot_threshold.patch application/octet-stream 4.4 KB
