less expensive pg_buffercache on big shmem

From: Ivan Kartyshov <i(dot)kartyshov(at)postgrespro(dot)ru>
To: pgsql-hackers(at)postgresql(dot)org
Subject: less expensive pg_buffercache on big shmem
Date: 2016-08-31 14:57:28
Message-ID: f9d6cab2-73a7-7a84-55a8-07dcb8516ae5@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi hackers,

Recently I have finished my work on a patch for pg_buffercache contrib,
I think it's time to share my results.


I want to offer you the implementation that allows to decrease system
workload by
partially sacrificing (fully snapshot consistency) data consistency.
Sometimes we do not need full data consistency, for example on
quantitative rather than qualitative analysis of memory contents, or
when we want to catch insufficient memory resources or how often
relation is used.

Problem description

Currently, the pg_buffercache v1.1 and prior takes an exclusive lock on
all shared buffers, which greatly affects system performance.
Usually we use pg_buffercache to find out why DB is working slower than
expected or examine what occupies the entire memory. So when we run
pg_buffercache on such system, we make it even slower.


Vanilla implementation contains loop which collecting statistic from
whole shared memory acquire, read and release Spinlocks one by one, page
by page while holding LWLock.

V1.2 implementation contains flexible loop which can collect shared
memory statistic using three different methods:
1) with holding LWLock only on one partition of shared memory
(semiconsistent method)
2) without LWLocks (nonconsistent method),
3) or in vanilia way (consistent method)

The aforementioned allow us to launch pg_buffercache in the three
different ways.
Each of them have some advantages and some disadvantages:

+ 100% consistency of shared memory snapshot
- Slowdown the system with whole shared memory exclusive lock

+ Faster than consistent method
+ Mostly doesn`t affect on the system load
- Speed of taking that snapshot is low
The fastest
+ Doesn`t noticeably affects on the systems
- <3% lost of snapshot consistency

What works

Actually, it work well even on big load, but of course there might be
things I've
VIEW pg_buffercache_cons
VIEW pg_buffercache_noncons
VIEW pg_buffercache_semicons

Examples from docs in new realization:
SELECT c.relname, count(*) AS buffers FROM pg_buffercache_noncons b
INNER JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid) AND
b.reldatabase IN (0, (SELECT oid FROM pg_database WHERE datname =
current_database())) GROUP BY c.relname ORDER BY 2 DESC LIMIT 10;

SELECT c.relname, count(*) AS buffers FROM pg_buffercache_semicons b
INNER JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid) AND
b.reldatabase IN (0, (SELECT oid FROM pg_database WHERE datname =
current_database())) GROUP BY c.relname ORDER BY 2 DESC LIMIT 10;

Testing the implementation

How implementation tested:
1) Start server
2) Make pgbench tps
pgbench -c 250 -s 1000 -T 200 -P1
3) Compare how tps sags under load if:
SELECT count(*) FROM pg_buffercache_cons;
SELECT count(*) FROM pg_buffercache_semicons;
SELECT count(*) FROM pg_buffercache_noncons;

This test was made on server (server parameters)
Model name: Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz
CPU(s): 144
Socket(s): 4
Shared_buffers: 200GB

Results of testing

Our DBA team obtained the following results:
* 10% faster then consistent method
* doesn`t noticeably affects on the systems
* the maximum loss of accuracy was less then 3%* ( in most situation it
is permissible accuracy loss )

* 5 time slower then nonconsistent
* made less affects on system compared to consistent

Overall results:
Our clients was pleased with this implementation.
Implementation is made with backward compatibility, as a conclusion old
pg_buffercache v1.1 queries will work well.
Semiconsistent show results approaching to nonconsistent on SELECTONLY

* this values were obtained from our DBA tests.

What can be better

It is unclear how to optimize the semiconsistent method to make it
faster, and reduce temporary effect that appears from time to time.

I will be glad to see your feedback!

Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
pg_buffercache_v1_2.patch text/x-patch 12.3 KB


Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-08-31 15:14:33 Re: Missing checks when malloc returns NULL...
Previous Message Craig Ringer 2016-08-31 14:54:10 Re: make async slave to wait for lsn to be replayed