[WIP] Zipfian distribution in pgbench

From: Alik Khilazhev <a(dot)khilazhev(at)postgrespro(dot)ru>
To: pgsql-hackers(at)postgresql(dot)org
Subject: [WIP] Zipfian distribution in pgbench
Date: 2017-07-07 07:45:29
Message-ID: BF3B6F54-68C3-417A-BFAB-FB4D66F2B410@postgrespro.ru
Lists: pgsql-hackers


PostgreSQL performs very poorly on YCSB Workload A (50% SELECT and 50% UPDATE of a random row by PK) when benchmarked with a large number of clients using a Zipfian distribution. MySQL also degrades, but not as significantly as PostgreSQL. MongoDB does not degrade at all. If pgbench had a Zipfian random number generator, everyone would be able to research this topic without using YCSB.

This is why I am currently working on a random_zipfian function.

The bottleneck of the algorithm I use is that it calculates the zeta function (it has linear complexity - https://en.wikipedia.org/wiki/Riemann_zeta_function). It may cause problems when generating a huge amount of big numbers.
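
To illustrate where the linear cost comes from, here is a minimal sketch in Python (not the C code from the patch). It assumes that "zeta" here means the truncated sum 1/1^s + 1/2^s + ... + 1/n^s used to normalize Zipfian probabilities over n values; the function names zeta and zipf_probability are made up for this example.

```python
def zeta(n: int, s: float) -> float:
    """Truncated zeta sum: 1/1**s + 1/2**s + ... + 1/n**s.

    One term per value in the range, so the cost is O(n).
    """
    return sum(1.0 / (i ** s) for i in range(1, n + 1))


def zipf_probability(k: int, n: int, s: float) -> float:
    """Probability of rank k (1-based) among n values with exponent s.

    Hypothetical helper for illustration; it recomputes zeta on every
    call, which is exactly the cost that caching is meant to avoid.
    """
    return (1.0 / (k ** s)) / zeta(n, s)


if __name__ == "__main__":
    # Rank 1 is the most likely value, and the probabilities sum to 1.
    print(zipf_probability(1, 100, 1.2) > zipf_probability(2, 100, 1.2))
    print(abs(sum(zipf_probability(k, 100, 1.2) for k in range(1, 101)) - 1.0) < 1e-9)
```

The larger the range, the more terms the sum has, which is why recomputing it for every generated number would be expensive.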

That’s why I added caching for the zeta value. It works well when random_zipfian is called with the same parameters throughout a script. For example:

\set a random_zipfian(1, 100, 1.2)
\set b random_zipfian(1, 100, 1.2)

Otherwise, the second call overwrites the cached value of the first, and caching does not make any sense:

\set a random_zipfian(1, 100, 1.2)
\set b random_zipfian(1, 200, 1.4)

So I have a question: should I implement caching of zeta values for calls with different parameters, or not?
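
To make the trade-off concrete, here is a rough sketch in Python (again, not the patch's C code) comparing a single-slot cache, as described above, with a cache keyed by the (n, s) parameters. The class names and the computations counter are hypothetical and exist only for this illustration.

```python
def compute_zeta(n: int, s: float) -> float:
    """Truncated zeta sum over n terms (the O(n) part)."""
    return sum(1.0 / (i ** s) for i in range(1, n + 1))


class SingleSlotZetaCache:
    """Remembers only the most recent (n, s) pair."""

    def __init__(self) -> None:
        self._key = None
        self._value = 0.0
        self.computations = 0  # counts cache misses

    def get(self, n: int, s: float) -> float:
        if self._key != (n, s):
            self._value = compute_zeta(n, s)
            self._key = (n, s)
            self.computations += 1
        return self._value


class KeyedZetaCache:
    """Remembers one value per distinct (n, s) pair."""

    def __init__(self) -> None:
        self._values = {}
        self.computations = 0  # counts cache misses

    def get(self, n: int, s: float) -> float:
        key = (n, s)
        if key not in self._values:
            self._values[key] = compute_zeta(n, s)
            self.computations += 1
        return self._values[key]


if __name__ == "__main__":
    single, keyed = SingleSlotZetaCache(), KeyedZetaCache()
    # Simulate three runs of a script that alternates two parameter sets.
    for _ in range(3):
        for n, s in ((100, 1.2), (200, 1.4)):
            single.get(n, s)
            keyed.get(n, s)
    print(single.computations)  # 6: every call evicts the other entry
    print(keyed.computations)   # 2: one computation per distinct pair
```

A keyed cache grows with the number of distinct parameter pairs in the script, so a real implementation would probably want a bound on its size; that choice is left open here.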

P.S. I am attaching the patch and scripts - an analogue of YCSB Workload A.
Run the benchmark with the command:
$ pgbench -f ycsb_read_zipf.sql -f ycsb_update_zipf.sql

At scale = 10 (1 million rows) it gives the following results on a machine with 144 cores (with synchronous_commit=off):
nclients  tps
1         8842.401870
2         18358.140869
4         45999.378785
8         88713.743199
16        170166.998212
32        290069.221493
64        178128.030553
128       88712.825602
256       38364.937573
512       13512.765878
1000      6188.136736

Attachment Content-Type Size
ycsb_read_zipf.sql application/sql 102 bytes
pgbench-zipf-01v.patch application/octet-stream 5.9 KB
ycsb_update_zipf.sql application/sql 105 bytes
unknown_filename text/plain 118 bytes

